diff --git a/documentation/content/en/articles/geom-class/_index.adoc b/documentation/content/en/articles/geom-class/_index.adoc index cf10202e74..7613cb780a 100644 --- a/documentation/content/en/articles/geom-class/_index.adoc +++ b/documentation/content/en/articles/geom-class/_index.adoc @@ -1,422 +1,422 @@ --- title: Writing a GEOM Class authors: - author: Ivan Voras email: ivoras@FreeBSD.org description: A guide to GEOM internals, and writing your own GEOM class trademarks: ["freebsd", "intel", "general"] tags: ["GEOM", "kernel", "modules", "FreeBSD"] --- = Writing a GEOM Class :doctype: article :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :source-highlighter: rouge :experimental: :images-path: articles/geom-class/ ifdef::env-beastie[] ifdef::backend-html5[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] :imagesdir: ../../../images/{images-path} endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [.abstract-title] Abstract This text documents some starting points in developing GEOM classes, and kernel modules in general. It is assumed that the reader is familiar with C userland programming. ''' toc::[] [[intro]] == Introduction [[intro-docs]] === Documentation Documentation on kernel programming is scarce - it is one of few areas where there is nearly nothing in the way of friendly tutorials, and the phrase "use the source!" really holds true. 
However, there are some bits and pieces (some of them seriously outdated) floating around that should be studied before beginning to code: * The extref:{developers-handbook}[FreeBSD Developer's Handbook] - part of the documentation project, it does not contain anything specific to kernel programming, but rather some general useful information. * The extref:{arch-handbook}[FreeBSD Architecture Handbook] - also from the documentation project, contains descriptions of several low-level facilities and procedures. The most important chapter is 13, extref:{arch-handbook}[Writing FreeBSD device drivers, driverbasics]. * The Blueprints section of http://www.freebsddiary.org[FreeBSD Diary] web site - contains several interesting articles on kernel facilities. * The man pages in section 9 - for important documentation on kernel functions. * The man:geom[4] man page and http://phk.freebsd.dk/pubs/[PHK's GEOM slides] - for general introduction of the GEOM subsystem. * Man pages man:g_bio[9], man:g_event[9], man:g_data[9], man:g_geom[9], man:g_provider[9], man:g_consumer[9], man:g_access[9] & others linked from those, for documentation on specific functionalities. * The man:style[9] man page - for documentation on the coding-style conventions which must be followed for any code which is to be committed to the FreeBSD tree. [[prelim]] == Preliminaries The best way to do kernel development is to have (at least) two separate computers. One of these would contain the development environment and sources, and the other would be used to test the newly written code by network-booting and network-mounting filesystems from the first one. This way if the new code contains bugs and crashes the machine, it will not mess up the sources (and other "live" data). The second system does not even require a proper display. Instead, it could be connected with a serial cable or KVM to the first one. 
But, since not everybody has two or more computers handy, there are a few things that can be done to prepare an otherwise "live" system for developing kernel code. This setup is also applicable for developing in a http://www.vmware.com/[VMWare] or http://www.qemu.org/[QEmu] virtual machine (the next best thing after a dedicated development machine). [[prelim-system]] === Modifying a System for Development For any kernel programming a kernel with `INVARIANTS` enabled is a must-have. So enter these in your kernel configuration file: [.programlisting] .... options INVARIANT_SUPPORT options INVARIANTS .... For more debugging you should also include WITNESS support, which will alert you to mistakes in locking: [.programlisting] .... options WITNESS_SUPPORT options WITNESS .... For debugging crash dumps, a kernel with debug symbols is needed: [.programlisting] .... makeoptions DEBUG=-g .... With the usual way of installing the kernel (`make installkernel`) the debug kernel will not be automatically installed. It is called [.filename]#kernel.debug# and located in [.filename]#/usr/obj/usr/src/sys/KERNELNAME/#. For convenience it should be copied to [.filename]#/boot/kernel/#. Another convenience is enabling the kernel debugger so you can examine a kernel panic when it happens. For this, enter the following lines in your kernel configuration file: [.programlisting] .... options KDB options DDB options KDB_TRACE .... For this to work you might need to set a sysctl (if it is not on by default): [.programlisting] .... debug.debugger_on_panic=1 .... Kernel panics will happen, so care should be taken with the filesystem cache. In particular, having softupdates might mean the latest file version could be lost if a panic occurs before it is committed to storage. Disabling softupdates incurs a large performance penalty, and still does not guarantee data consistency. Mounting the filesystem with the "sync" option is needed for that. 
For a compromise, the softupdates cache delays can be shortened. There are three sysctl's that are useful for this (best to be set in [.filename]#/etc/sysctl.conf#): [.programlisting] .... kern.filedelay=5 kern.dirdelay=4 kern.metadelay=3 .... The numbers represent seconds. For debugging kernel panics, kernel core dumps are required. Since a kernel panic might make filesystems unusable, this crash dump is first written to a raw partition. Usually, this is the swap partition. This partition must be at least as large as the physical RAM in the machine. On the next boot, the dump is copied to a regular file. This happens after filesystems are checked and mounted, and before swap is enabled. This is controlled with two [.filename]#/etc/rc.conf# variables: [.programlisting] .... dumpdev="/dev/ad0s4b" -dumpdir="/usr/core +dumpdir="/usr/core" .... The `dumpdev` variable specifies the swap partition and `dumpdir` tells the system where in the filesystem to relocate the core dump on reboot. Writing kernel core dumps is slow and takes a long time so if you have lots of memory (>256M) and lots of panics it could be frustrating to sit and wait while it is done (twice - first to write it to swap, then to relocate it to filesystem). It is convenient then to limit the amount of RAM the system will use via a [.filename]#/boot/loader.conf# tunable: [.programlisting] .... hw.physmem="256M" .... If the panics are frequent and filesystems large (or you simply do not trust softupdates+background fsck) it is advisable to turn background fsck off via [.filename]#/etc/rc.conf# variable: [.programlisting] .... background_fsck="NO" .... This way, the filesystems will always get checked when needed. Note that with background fsck, a new panic could happen while it is checking the disks. Again, the safest way is not to have many local filesystems by using another computer as an NFS server. 
[[prelim-starting]] === Starting the Project For the purpose of creating a new GEOM class, an empty subdirectory has to be created under an arbitrary user-accessible directory. You do not have to create the module directory under [.filename]#/usr/src#. [[prelim-makefile]] === The Makefile It is good practice to create [.filename]#Makefiles# for every nontrivial coding project, which of course includes kernel modules. Creating the [.filename]#Makefile# is simple thanks to an extensive set of helper routines provided by the system. In short, here is how a minimal [.filename]#Makefile# looks for a kernel module: [.programlisting] .... SRCS=g_journal.c KMOD=geom_journal .include <bsd.kmod.mk> .... This [.filename]#Makefile# (with changed filenames) will do for any kernel module, and a GEOM class can reside in just one kernel module. If more than one file is required, list them in the `SRCS` variable, separated with whitespace from other filenames. [[kernelprog]] == On FreeBSD Kernel Programming [[kernelprog-memalloc]] === Memory Allocation See man:malloc[9]. Basic memory allocation is only slightly different from its userland equivalent. Most notably, `malloc`() and `free`() accept additional parameters as described in the man page. A "malloc type" must be declared in the declaration section of a source file, like this: [.programlisting] .... static MALLOC_DEFINE(M_GJOURNAL, "gjournal data", "GEOM_JOURNAL Data"); .... To use this macro, [.filename]#sys/param.h#, [.filename]#sys/kernel.h# and [.filename]#sys/malloc.h# headers must be included. There is another mechanism for allocating memory, the UMA (Universal Memory Allocator). See man:uma[9] for details, but it is a special type of allocator mainly used for speedy allocation of lists made up of same-sized items (for example, dynamic arrays of structs). [[kernelprog-lists]] === Lists and Queues See man:queue[3]. There are a LOT of cases when a list of things needs to be maintained. 
Fortunately, this data structure is implemented (in several ways) by C macros included in the system. The most used list type is TAILQ because it is the most flexible. It is also the one with largest memory requirements (its elements are doubly-linked) and also the slowest (although the speed variation is on the order of several CPU instructions more, so it should not be taken seriously). If data retrieval speed is very important, see man:tree[3] and man:hashinit[9]. [[kernelprog-bios]] === BIOs Structure `bio` is used for any and all Input/Output operations concerning GEOM. It basically contains information about what device ('provider') should satisfy the request, request type, offset, length, pointer to a buffer, and a bunch of "user-specific" flags and fields that can help implement various hacks. The important thing here is that ``bio``s are handled asynchronously. -That means that, in most parts of the code, there is no analogue to userland's man:read[2] and man:write[2] calls that do not return until a request is done. +That means that, in most parts of the code, there is no analogue to userland's man:read[2] and man:write[2] calls that do not return until a request is done. Rather, a developer-supplied function is called as a notification when the request gets completed (or results in error). The asynchronous programming model (also called "event-driven") is somewhat harder than the much more used imperative one used in userland (at least it takes a while to get used to it). In some cases the helper routines `g_write_data`() and `g_read_data`() can be used, but __not always__. In particular, they cannot be used when a mutex is held; for example, the GEOM topology mutex or the internal mutex held during the `.start`() and `.stop`() functions. [[geom]] == On GEOM Programming [[geom-ggate]] === Ggate If maximum performance is not needed, a much simpler way of making a data transformation is to implement it in userland via the ggate (GEOM gate) facility. 
Unfortunately, there is no easy way to convert between, or even share code between the two approaches. [[geom-class]] === GEOM Class GEOM classes are transformations on the data. These transformations can be combined in a tree-like fashion. Instances of GEOM classes are called __geoms__. Each GEOM class has several "class methods" that get called when there is no geom instance available (or they are simply not bound to a single instance): * `.init` is called when GEOM becomes aware of a GEOM class (when the kernel module gets loaded). * `.fini` gets called when GEOM abandons the class (when the module gets unloaded). * `.taste` is called next, once for each provider the system has available. If applicable, this function will usually create and start a geom instance. * `.destroy_geom` is called when the geom should be disbanded. * `.ctlreq` is called when the user requests reconfiguration of an existing geom. Also defined are the GEOM event functions, which will get copied to the geom instance. Field `.geom` in the `g_class` structure is a LIST of geoms instantiated from the class. These functions are called from the g_event kernel thread. [[geom-softc]] === Softc The name "softc" is a legacy term for "driver private data". The name most probably comes from the archaic term "software control block". In GEOM, it is a structure (more precisely: a pointer to a structure) that can be attached to a geom instance to hold whatever data is private to the geom instance. The softc structures of most GEOM classes contain the following members: * `struct g_provider *provider` : The "provider" this geom instantiates * `uint16_t n_disks` : Number of consumers this geom consumes * `struct g_consumer \**disks` : Array of `struct g_consumer*`. (It is not possible to use just single indirection because the `struct g_consumer` instances are created on our behalf by GEOM). The `softc` structure contains all the state of a geom instance. Every geom instance has its own softc. 
[[geom-metadata]] === Metadata Format of metadata is more-or-less class-dependent, but MUST start with: * 16 byte buffer for null-terminated signature (usually the class name) * uint32 version ID It is assumed that geom classes know how to handle metadata with version ID's lower than theirs. Metadata is located in the last sector of the provider (and thus must fit in it). (All this is implementation-dependent but all existing code works like that, and it is supported by libraries.) [[geom-creating]] === Labeling/creating a GEOM The sequence of events is: * user calls man:geom[8] utility (or one of its hardlinked friends) * the utility figures out which geom class it is supposed to handle and searches for [.filename]#geom_CLASSNAME.so# library (usually in [.filename]#/lib/geom#). * it man:dlopen[3]-s the library, extracts the definitions of command-line parameters and helper functions. In the case of creating/labeling a new geom, this is what happens: * man:geom[8] looks in the command-line argument for the command (usually `label`), and calls a helper function. * The helper function checks parameters and gathers metadata, which it proceeds to write to all concerned providers. * This "spoils" existing geoms (if any) and initializes a new round of "tasting" of the providers. The intended geom class recognizes the metadata and brings the geom up. (The above sequence of events is implementation-dependent but all existing code works like that, and it is supported by libraries.) [[geom-command]] === GEOM Command Structure The helper [.filename]#geom_CLASSNAME.so# library exports `class_commands` structure, which is an array of `struct g_command` elements. Commands are of uniform format and look like: [.programlisting] .... verb [-options] geomname [other] .... 
Common verbs are: * label - to write metadata to devices so they can be recognized at tasting and brought up in geoms * destroy - to destroy metadata, so the geoms get destroyed Common options are: * `-v` : be verbose * `-f` : force Many actions, such as labeling and destroying metadata, can be performed in userland. For this, `struct g_command` provides the field `gc_func` that can be set to a function (in the same [.filename]#.so#) that will be called to process a verb. If `gc_func` is NULL, the command will be passed to the kernel module, to the `.ctlreq` function of the geom class. [[geom-geoms]] === Geoms Geoms are instances of GEOM classes. They have internal data (a softc structure) and some functions with which they respond to external events. The event functions are: * `.access` : calculates permissions (read/write/exclusive) * `.dumpconf` : returns XML-formatted information about the geom * `.orphan` : called when some underlying provider gets disconnected * `.spoiled` : called when some underlying provider gets written to * `.start` : handles I/O These functions are called from the `g_down` kernel thread and there can be no sleeping in this context (see the definition of sleeping below), which limits what can be done quite a bit, but forces the handling to be fast. Of these, the most important function for doing actual useful work is the `.start`() function, which is called when a BIO request arrives for a provider managed by an instance of the geom class. [[geom-threads]] === GEOM Threads There are three kernel threads created and run by the GEOM framework: * `g_down` : Handles requests coming from high-level entities (such as a userland request) on the way to physical devices * `g_up` : Handles responses from device drivers to requests made by higher-level entities * `g_event` : Handles all other cases: creation of geom instances, access counting, "spoil" events, etc. 
When a user process issues "read data X at offset Y of a file" request, this is what happens: * The filesystem converts the request into a struct bio instance and passes it to the GEOM subsystem. It knows what geom instance should handle it because filesystems are hosted directly on a geom instance. * The request ends up as a call to the `.start`() function made on the g_down thread and reaches the top-level geom instance. * This top-level geom instance (for example the partition slicer) determines that the request should be routed to a lower-level instance (for example the disk driver). It makes a copy of the bio request (bio requests _ALWAYS_ need to be copied between instances, with `g_clone_bio`()!), modifies the data offset and target provider fields and executes the copy with `g_io_request`() * The disk driver gets the bio request also as a call to `.start`() on the `g_down` thread. It talks to hardware, gets the data back, and calls `g_io_deliver`() on the bio. * Now, the notification of bio completion "bubbles up" in the `g_up` thread. First the partition slicer gets `.done`() called in the `g_up` thread, it uses information stored in the bio to free the cloned `bio` structure (with `g_destroy_bio`()) and calls `g_io_deliver`() on the original request. * The filesystem gets the data and transfers it to userland. See man:g_bio[9] man page for information how the data is passed back and forth in the `bio` structure (note in particular the `bio_parent` and `bio_children` fields and how they are handled). One important feature is: __THERE CAN BE NO SLEEPING IN G_UP AND G_DOWN THREADS__. This means that none of the following things can be done in those threads (the list is of course not complete, but only informative): * Calls to `msleep`() and `tsleep`(), obviously. * Calls to `g_write_data`() and `g_read_data`(), because these sleep between passing the data to consumers and returning. * Waiting for I/O. 
* Calls to man:malloc[9] and `uma_zalloc`() with the `M_WAITOK` flag set * sx and other sleepable locks This restriction is here to stop GEOM code clogging the I/O request path, since sleeping is usually not time-bound and there can be no guarantees on how long it will take (there are some other, more technical reasons also). It also means that there is not much that can be done in those threads; for example, almost any complex thing requires memory allocation. Fortunately, there is a way out: creating additional kernel threads. [[geom-kernelthreads]] === Kernel Threads for Use in GEOM Code Kernel threads are created with the man:kthread_create[9] function, and they are sort of similar to userland threads in behavior, only they cannot return to the caller to signify termination, but must call man:kthread_exit[9]. In GEOM code, the usual use of threads is to offload processing of requests from the `g_down` thread (the `.start`() function). These threads look like "event handlers": they have a linked list of events associated with them (on which events can be posted by various functions in various threads, so it must be protected by a mutex), take the events from the list one by one and process them in a big `switch`() statement. The main benefit of using a thread to handle I/O requests is that it can sleep when needed. Now, this sounds good, but should be carefully thought out. Sleeping is all well and good and very convenient, but it can very effectively destroy the performance of the geom transformation. Extremely performance-sensitive classes probably should do all the work in the `.start`() function call, taking great care to handle out-of-memory and similar errors. The other benefit of having an event-handler thread like that is to serialize all the requests and responses coming from different geom threads into one thread. This is also very convenient but can be slow. In most cases, handling of `.done`() requests can be left to the `g_up` thread. 
Mutexes in the FreeBSD kernel (see man:mutex[9]) have one distinction from their more common userland cousins: the code cannot sleep while holding a mutex. If the code needs to sleep a lot, man:sx[9] locks may be more appropriate. On the other hand, if you do almost everything in a single thread, you may get away with no mutexes at all. diff --git a/documentation/content/en/articles/mailing-list-faq/_index.adoc b/documentation/content/en/articles/mailing-list-faq/_index.adoc index ab708fc0e4..64595b6bb0 100644 --- a/documentation/content/en/articles/mailing-list-faq/_index.adoc +++ b/documentation/content/en/articles/mailing-list-faq/_index.adoc @@ -1,211 +1,211 @@ --- title: Frequently Asked Questions About The FreeBSD Mailing Lists authors: - author: The FreeBSD Documentation Project copyright: 2004-2021 The FreeBSD Documentation Project description: How to best use the mailing lists, such as how to help avoid frequently-repeated discussions tags: ["FAQ", "Mailing Lists", "FreeBSD"] --- = Frequently Asked Questions About The FreeBSD Mailing Lists :doctype: article :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :source-highlighter: rouge :experimental: :images-path: articles/mailing-list-faq/ ifdef::env-beastie[] ifdef::backend-html5[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] :imagesdir: ../../../images/{images-path} endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [.abstract-title] Abstract This is the FAQ for the FreeBSD mailing lists. If you are interested in helping with this project, send email to the {freebsd-doc}. 
-The latest version of this document is always available from the link:.[FreeBSD World Wide Web server]. -It may also be downloaded as one large link:.[HTML] file with HTTP or as plain text, PostScript, PDF, etc. from the https://download.freebsd.org/doc/[FreeBSD FTP server]. +The latest version of this document is always available from the extref:{mailing-list-faq}[FreeBSD World Wide Web server]. +It may also be downloaded as one large extref:{mailing-list-faq}[HTML] file with HTTP or as plain text, PostScript, PDF, etc. from the https://download.freebsd.org/doc/[FreeBSD FTP server]. You may also want to link:https://www.FreeBSD.org/search/[Search the FAQ]. ''' toc::[] [[introduction]] == Introduction As is usual with FAQs, this document aims to cover the most frequently asked questions concerning the FreeBSD mailing lists (and of course answer them!). Although originally intended to reduce bandwidth and avoid the same old questions being asked over and over again, FAQs have become recognized as valuable information resources. This document attempts to represent a community consensus, and as such it can never really be __authoritative__. However, if you find technical errors within this document, or have suggestions about items that should be added, please either submit a PR, or email the {freebsd-doc}. Thanks. === What is the purpose of the FreeBSD mailing lists? The FreeBSD mailing lists serve as the primary communication channels for the FreeBSD community, covering many different topic areas and communities of interest. === Who is the audience for the FreeBSD mailing lists? This depends on charter of each individual list. Some lists are more oriented to developers; some are more oriented towards the FreeBSD community as a whole. Please see link:https://lists.FreeBSD.org/[this list] for the current summary. Lists are English language, unless stated otherwise. === Are the FreeBSD mailing lists open for anyone to participate? 
Again, this depends on the charter of each individual list. Please read the charter of a mailing list before you post to it, and respect it when you post. This will help everyone to have a better experience with the lists. If, after reading the above lists, you still do not know which mailing list to post a question to, you will probably want to post to freebsd-questions (but see below, first). Note that you must subscribe to a mailing list before you can post. You can elect to subscribe without receiving messages posted to the mailing list. === How can I subscribe? You can use link:https://lists.FreeBSD.org/[the Mlmmj web interface] to subscribe to any of the public lists. === How can I unsubscribe? You can use the same interface as above; or, you can follow the instructions that are at the bottom of every mailing list message that is sent. Please do not send unsubscribe messages directly to the public lists themselves. First, this will not accomplish your goal, and second, it will irritate the existing subscribers, and you will probably get flamed. This is a classic mistake when using mailing lists; please try to avoid it. === Are archives available? Yes. Threaded archives with all e-mails since 1994 are available link:https://mail-archive.freebsd.org/mail/[here]. You can also access the https://lists.freebsd.org/pipermail[mailman archive] and the link:https://lists.freebsd.org/archives[mlmmj archive] directly. === Are mailing lists available in a digest format? Yes. See link:https://lists.FreeBSD.org/[the Mlmmj web interface]. [[etiquette]] == Mailing List Etiquette Participation in the mailing lists, like participation in any community, requires a common basis for communication. Please make only appropriate postings, and follow common rules of etiquette. === What should I do before I post? You have already taken the most important step by reading this document. 
However, if you are new to FreeBSD, you may first need to familiarize yourself with the software, and all the social history around it, by reading the numerous link:https://www.FreeBSD.org/docs/books/[books and articles] that are available. Items of particular interest include the extref:{faq}[FreeBSD Frequently Asked Questions (FAQ)] document, the extref:{handbook}[FreeBSD Handbook], and the articles extref:{freebsd-questions-article}[How to get best results from the FreeBSD-questions mailing list], extref:{explaining-bsd}[Explaining BSD], and extref:{new-users}[FreeBSD First Steps]. It is always considered bad form to ask a question that is already answered in the above documents. This is not because the volunteers who work on this project are particularly mean people, but after a certain number of times answering the same questions over and over again, frustration begins to set in. This is particularly true if there is an existing answer to the question that is already available. Always keep in mind that almost all of the work done on FreeBSD is done by volunteers, and that we are only human. === What constitutes an inappropriate posting? * Postings must be in accordance with the charter of the mailing list. * Personal attacks are discouraged. As good net-citizens, we should try to hold ourselves to high standards of behavior. * Spam is not allowed, ever. The mailing lists are actively processed to ban offenders to this rule. === What is considered proper etiquette when posting to the mailing lists? * Please wrap lines at 75 characters, since not everyone uses fancy GUI mail reading programs. * Please respect the fact that bandwidth is not infinite. Not everyone reads email through high-speed connections, so if your posting involves something like the content of [.filename]#config.log# or an extensive stack trace, please consider putting that information up on a website somewhere and just provide a URL to it. 
Remember, too, that these postings will be archived indefinitely, so huge postings will simply inflate the size of the archives long after their purpose has expired. * Format your message so that it is legible, and PLEASE DO NOT SHOUT!!!!!. Do not underestimate the effect that a poorly formatted mail message has, and not just on the FreeBSD mailing lists. Your mail message is all that people see of you, and if it is poorly formatted, badly spelled, full of errors, and/or has lots of exclamation points, it will give people a poor impression of you. * Please use an appropriate human language for a particular mailing list. Many non-English mailing lists are link:https://www.FreeBSD.org/community/mailinglists/[available]. + For the ones that are not, we do appreciate that many people do not speak English as their first language, and we try to make allowances for that. It is considered particularly poor form to criticize non-native speakers for spelling or grammatical errors. FreeBSD has an excellent track record in this regard; please, help us to uphold that tradition. * Please use a standards-compliant Mail User Agent (MUA). A lot of badly formatted messages come from http://www.lemis.com/grog/email/email.php[bad mailers or badly configured mailers]. The following mailers are known to send out badly formatted messages without you finding out about them: ** exmh ** Microsoft(R) Exchange ** Microsoft(R) Outlook(R) + Try not to use MIME: a lot of people use mailers which do not get on very well with MIME. * Make sure your time and time zone are set correctly. This may seem a little silly, since your message still gets there, but many of the people on these mailing lists get several hundred messages a day. They frequently sort the incoming messages by subject and by date, and if your message does not come before the first answer, they may assume that they missed it and not bother to look. 
* A lot of the information you need to supply is the output of programs, such as man:dmesg[8], or console messages, which usually appear in [.filename]#/var/log/messages#. Do not try to copy this information by typing it in again; not only is it a real pain, but you are bound to make a mistake. To send log file contents, either make a copy of the file and use an editor to trim the information to what is relevant, or cut and paste into your message. For the output of programs like `dmesg`, redirect the output to a file and include that. For example, + [source,shell] .... % dmesg > /tmp/dmesg.out .... + This redirects the information to the file [.filename]#/tmp/dmesg.out#. * When using cut-and-paste, please be aware that some such operations badly mangle their messages. This is of particular concern when posting the contents of [.filename]#Makefiles#, where `tab` is a significant character. This is a very common, and very annoying, problem with submissions to the link:https://www.FreeBSD.org/support/[Problem Reports database]. [.filename]#Makefiles# with tabs changed to either spaces, or the annoying `=3B` escape sequence, create a great deal of aggravation for committers. === What are the special etiquette considerations when replying to an existing posting on the mailing lists? * Please include relevant text from the original message. Trim it to the minimum, but do not overdo it. It should still be possible for somebody who did not read the original message to understand what you are talking about. + This is especially important for postings of the type "yes, I see this too", where the initial posting was dozens or hundreds of lines. * Use some technique to identify which text came from the original message, and which text you add. A common convention is to prepend "`>`" to the original message. Leaving white space after the "`>`" and leaving empty lines between your text and the original text both make the result more readable. 
* Please ensure that the attribution of the text you are quoting is correct. People can become offended if you attribute words to them that they themselves did not write. * Please do not `top post`. By this, we mean that if you are replying to a message, please put your replies after the text that you copy in your reply. + ** A: Because it reverses the logical flow of conversation. ** Q: Why is top posting frowned upon? + (Thanks to Randy Bush for the joke.) [[recurring]] == Recurring Topics On The Mailing Lists Participation in the mailing lists, like participation in any community, requires a common basis for communication. Many of the mailing lists presuppose a knowledge of the Project's history. In particular, there are certain topics that seem to regularly occur to newcomers to the community. It is the responsibility of each poster to ensure that their postings do not fall into one of these categories. By doing so, you will help the mailing lists to stay on-topic, and probably save yourself being flamed in the process. The best method to avoid this is to familiarize yourself with the http://docs.FreeBSD.org/mail/[mailing list archives], to help yourself understand the background of what has gone before. In this, the https://www.FreeBSD.org/search/#mailinglists[mailing list search interface] is invaluable. (If that method does not yield useful results, please supplement it with a search with your favorite major search engine). By familiarizing yourself with the archives, not only will you learn what topics have been discussed before, but also how discussion tends to proceed on that list, who the participants are, and who the target audience is. These are always good things to know before you post to any mailing list, not just a FreeBSD mailing list. There is no doubt that the archives are quite extensive, and some questions recur more often than others, sometimes as followups where the subject line no longer accurately reflects the new content.
Nevertheless, the burden is on you, the poster, to do your homework to help avoid these recurring topics. [[bikeshed]] == What Is A "Bikeshed"? Literally, a `bikeshed` is a small outdoor shelter into which one may store one's two-wheeled form of transportation. However, in FreeBSD parlance, the term refers to topics that are simple enough that (nearly) anyone can offer an opinion about them, and often (nearly) everyone does. The genesis of this term is explained in more detail extref:{faq}[in this document, bikeshed-painting]. You simply must have a working knowledge of this concept before posting to any FreeBSD mailing list. More generally, a bikeshed is a topic that will tend to generate immediate meta-discussions and flames if you have not read up on its history. Please help us to keep the mailing lists as useful for as many people as possible by avoiding bikesheds whenever you can. Thanks. [[acknowledgments]] == Acknowledgments `{grog}`:: Original author of most of the material on mailing list etiquette, taken from the article on extref:{freebsd-questions-article}[How to get best results from the FreeBSD-questions mailing list]. `{linimon}`:: Creation of the rough draft of this FAQ. diff --git a/documentation/content/en/articles/pr-guidelines/_index.adoc b/documentation/content/en/articles/pr-guidelines/_index.adoc index b6729150cd..d645d4637b 100644 --- a/documentation/content/en/articles/pr-guidelines/_index.adoc +++ b/documentation/content/en/articles/pr-guidelines/_index.adoc @@ -1,515 +1,515 @@ --- title: Problem Report Handling Guidelines authors: - author: Dag-Erling Smørgrav - author: Hiten Pandya description: These guidelines describe recommended handling practices for FreeBSD Problem Reports (PRs).
trademarks: ["freebsd", "general"] tags: ["PR", "guideline", "bugs", "maintenance", "BugZilla", "FreeBSD"] --- = Problem Report Handling Guidelines :doctype: article :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :source-highlighter: rouge :experimental: :images-path: articles/pr-guidelines/ ifdef::env-beastie[] ifdef::backend-html5[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] :imagesdir: ../../../images/{images-path} endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [.abstract-title] Abstract These guidelines describe recommended handling practices for FreeBSD Problem Reports (PRs). Whilst developed for the FreeBSD PR Database Maintenance Team mailto:freebsd-bugbusters@FreeBSD.org[freebsd-bugbusters@FreeBSD.org], these guidelines should be followed by anyone working with FreeBSD PRs. ''' toc::[] [[intro]] == Introduction Bugzilla is an issue management system used by the FreeBSD Project. As accurate tracking of outstanding software defects is important to FreeBSD's quality, the correct use of the software is essential to the forward progress of the Project. Access to Bugzilla is available to the entire FreeBSD community. In order to maintain consistency within the database and provide a consistent user experience, guidelines have been established covering common aspects of bug management such as presenting followup, handling close requests, and so forth. [[pr-lifecycle]] == Problem Report Life-cycle * The Reporter submits a bug report on the website. The bug is in the `Needs Triage` state. 
* Jane Random BugBuster confirms that the bug report has sufficient information to be reproducible. If not, she goes back and forth with the reporter to obtain the needed information. At this point the bug is set to the `Open` state. * Joe Random Committer takes interest in the PR and assigns it to himself, or Jane Random BugBuster decides that Joe is best suited to handle it and assigns it to him. The bug should be set to the `In Discussion` state. * Joe has a brief exchange with the originator (making sure it all goes into the audit trail) and determines the cause of the problem. * Joe pulls an all-nighter and whips up a patch that he thinks fixes the problem, and submits it in a follow-up, asking the originator to test it. He then sets the PR's state to `Patch Ready`. * A couple of iterations later, both Joe and the originator are satisfied with the patch, and Joe commits it to `-CURRENT` (or directly to `-STABLE` if the problem does not exist in `-CURRENT`), making sure to reference the Problem Report in his commit log (and credit the originator if they submitted all or part of the patch) and, if appropriate, start an MFC countdown. The bug is set to the `Needs MFC` state. * If the patch does not need MFCing, Joe then closes the PR as `Issue Resolved`. [NOTE] ==== Many PRs are submitted with very little information about the problem, and some are either very complex to solve, or just scratch the surface of a larger problem; in these cases, it is very important to obtain all the necessary information needed to solve the problem. If the problem contained within cannot be solved, or has occurred again, it is necessary to re-open the PR. ==== [[pr-states]] == Problem Report State It is important to update the state of a PR when certain actions are taken. The state should accurately reflect the current state of work on the PR.
.A small example on when to change PR state [example] ==== When a PR has been worked on and the developer(s) responsible feel comfortable about the fix, they will submit a followup to the PR and change its state to "feedback". At this point, the originator should evaluate the fix in their context and respond indicating whether the defect has indeed been remedied. ==== A Problem Report may be in one of the following states: open:: Initial state; the problem has been pointed out and it needs reviewing. analyzed:: The problem has been reviewed and a solution is being sought. feedback:: Further work requires additional information from the originator or the community; possibly information regarding the proposed solution. patched:: A patch has been committed, but something (MFC, or maybe confirmation from originator) is still pending. suspended:: The problem is not being worked on, due to lack of information or resources. This is a prime candidate for somebody who is looking for a project to take on. If the problem cannot be solved at all, it will be closed, rather than suspended. The documentation project uses suspended for wish-list items that entail a significant amount of work which no one currently has time for. closed:: A problem report is closed when any changes have been integrated, documented, and tested, or when fixing the problem is abandoned. [NOTE] ==== The "patched" state is directly related to feedback, so you may go directly to "closed" state if the originator cannot test the patch, and it works in your own testing. ==== [[pr-types]] == Types of Problem Reports While handling problem reports, either as a developer who has direct access to the Problem Reports database or as a contributor who browses the database and submits followups with patches, comments, suggestions or change requests, you will come across several different types of PRs. 
* crossref:pr-guidelines[pr-unassigned, Unassigned PRs] * crossref:pr-guidelines[pr-assigned, Assigned PRs] * crossref:pr-guidelines[pr-dups, Duplicate PRs] * crossref:pr-guidelines[pr-stale, Stale PRs] * crossref:pr-guidelines[pr-misfiled-notpr, Non-Bug PRs] The following sections describe what each type of PR is used for, when a PR belongs to one of these types, and what treatment each different type receives. [[pr-unassigned]] == Unassigned PRs When PRs arrive, they are initially assigned to a generic (placeholder) assignee. These are always prepended with `freebsd-`. The exact value for this default depends on the category; in most cases, it corresponds to a specific FreeBSD mailing list. Here is the current list, with the most common ones listed first: [[default-assignees-common]] .Default Assignees - most common [cols="1,1,1", options="header"] |=== | Type | Categories | Default Assignee |base system |bin, conf, gnu, kern, misc |freebsd-bugs |architecture-specific |alpha, amd64, arm, i386, ia64, powerpc, sparc64 |freebsd-_arch_ |ports collection |ports |freebsd-ports-bugs |documentation shipped with the system |docs |freebsd-doc |FreeBSD web pages (not including docs) |Website |freebsd-www |=== [[default-assignees-other]] .Default Assignees - other [cols="1,1,1", options="header"] |=== | Type | Categories | Default Assignee |advocacy efforts |advocacy |freebsd-advocacy |Java Virtual Machine(TM) problems |java |freebsd-java |standards compliance |standards |freebsd-standards |threading libraries |threads |freebsd-threads |man:usb[4] subsystem |usb |freebsd-usb |=== Do not be surprised to find that the submitter of the PR has assigned it to the wrong category. If you fix the category, do not forget to fix the assignment as well. (In particular, our submitters seem to have a hard time understanding that just because their problem manifested on an i386 system, it might be generic to all of FreeBSD, and thus be more appropriate for `kern`.
The converse is also true, of course.) Certain PRs may be reassigned away from these generic assignees by anyone. There are several types of assignees: specialized mailing lists; mail aliases (used for certain limited-interest items); and individuals. For assignees which are mailing lists, please use the long form when making the assignment (e.g., `freebsd-foo` instead of `foo`); this will avoid duplicate emails sent to the mailing list. [NOTE] ==== -Since the list of individuals who have volunteered to be the default assignee for certain types of PRs changes so often, it is much more suitable for https://wiki.freebsd.org/AssigningPRs[the FreeBSD wiki]. +Since the list of individuals who have volunteered to be the default assignee for certain types of PRs changes so often, it is much more suitable for https://wiki.freebsd.org/AssigningPRs[the FreeBSD wiki]. ==== Here is a sample list of such entities; it is probably not complete. [[common-assignees-base]] .Common Assignees - base system [cols="1,1,1,1", options="header"] |=== | Type | Suggested Category | Suggested Assignee | Assignee Type |problem specific to the ARM(R) architecture |arm |freebsd-arm |mailing list |problem specific to the MIPS(R) architecture |kern |freebsd-mips |mailing list |problem specific to the PowerPC(R) architecture |kern |freebsd-ppc |mailing list |problem with Advanced Configuration and Power Management (man:acpi[4]) |kern |freebsd-acpi |mailing list |problem with Asynchronous Transfer Mode (ATM) drivers |kern |freebsd-atm |mailing list |problem with embedded or small-footprint FreeBSD systems (e.g., NanoBSD/PicoBSD/FreeBSD-arm) |kern |freebsd-embedded |mailing list |problem with FireWire(R) drivers |kern |freebsd-firewire |mailing list |problem with the filesystem code |kern |freebsd-fs |mailing list |problem with the man:geom[4] subsystem |kern |freebsd-geom |mailing list |problem with the man:ipfw[4] subsystem |kern |freebsd-ipfw |mailing list |problem with Integrated Services Digital 
Network (ISDN) drivers |kern |freebsd-isdn |mailing list |man:jail[8] subsystem |kern |freebsd-jail |mailing list |problem with Linux(R) or SVR4 emulation |kern |freebsd-emulation |mailing list |problem with the networking stack |kern |freebsd-net |mailing list |problem with the man:pf[4] subsystem |kern |freebsd-pf |mailing list |problem with the man:scsi[4] subsystem |kern |freebsd-scsi |mailing list |problem with the man:sound[4] subsystem |kern |freebsd-multimedia |mailing list |problems with the man:wlan[4] subsystem and wireless drivers |kern |freebsd-wireless |mailing list |problem with man:sysinstall[8] or man:bsdinstall[8] |bin |freebsd-sysinstall |mailing list |problem with the system startup scripts (man:rc[8]) |kern |freebsd-rc |mailing list |problem with VIMAGE or VNET functionality and related code |kern |freebsd-virtualization |mailing list |problem with Xen emulation |kern |freebsd-xen |mailing list |=== [[common-assignees-ports]] .Common Assignees - Ports Collection [cols="1,1,1,1", options="header"] |=== | Type | Suggested Category | Suggested Assignee | Assignee Type |problem with the ports framework (__not__ with an individual port!) 
|ports |portmgr |alias |port which is maintained by apache@FreeBSD.org |ports |apache |mailing list |port which is maintained by autotools@FreeBSD.org |ports |autotools |alias |port which is maintained by doceng@FreeBSD.org |ports |doceng |alias |port which is maintained by eclipse@FreeBSD.org |ports |freebsd-eclipse |mailing list |port which is maintained by gecko@FreeBSD.org |ports |gecko |mailing list |port which is maintained by gnome@FreeBSD.org |ports |gnome |mailing list |port which is maintained by hamradio@FreeBSD.org |ports |hamradio |alias |port which is maintained by haskell@FreeBSD.org |ports |haskell |alias |port which is maintained by java@FreeBSD.org |ports |freebsd-java |mailing list |port which is maintained by kde@FreeBSD.org |ports |kde |mailing list |port which is maintained by mono@FreeBSD.org |ports |mono |mailing list |port which is maintained by office@FreeBSD.org |ports |freebsd-office |mailing list |port which is maintained by perl@FreeBSD.org |ports |perl |mailing list |port which is maintained by python@FreeBSD.org |ports |freebsd-python |mailing list |port which is maintained by ruby@FreeBSD.org |ports |freebsd-ruby |mailing list |port which is maintained by secteam@FreeBSD.org |ports |secteam |alias |port which is maintained by vbox@FreeBSD.org |ports |vbox |alias |port which is maintained by x11@FreeBSD.org |ports |freebsd-x11 |mailing list |=== -Ports PRs which have a maintainer who is a ports committer may be reassigned by anyone (but note that not every FreeBSD committer is necessarily a ports committer, so you cannot simply go by the email address alone.) +Ports PRs which have a maintainer who is a ports committer may be reassigned by anyone (but note that not every FreeBSD committer is necessarily a ports committer, so you cannot simply go by the email address alone.) For other PRs, please do not reassign them to individuals (other than yourself) unless you are certain that the assignee really wants to track the PR. 
This will help to avoid the case where no one looks at fixing a particular problem because everyone assumes that the assignee is already working on it. [[common-assignees-other]] .Common Assignees - Other [cols="1,1,1,1", options="header"] |=== | Type | Suggested Category | Suggested Assignee | Assignee Type |problem with PR database |bin |bugmeister |alias |problem with Bugzilla https://bugs.freebsd.org/submit/[web form]. |doc |bugmeister |alias |=== [[pr-assigned]] == Assigned PRs If a PR has the `responsible` field set to the username of a FreeBSD developer, it means that the PR has been handed over to that particular person for further work. Assigned PRs should not be touched by anyone but the assignee or bugmeister. If you have comments, submit a followup. If for some reason you think the PR should change state or be reassigned, send a message to the assignee. If the assignee does not respond within two weeks, unassign the PR and do as you please. [[pr-dups]] == Duplicate PRs If you find more than one PR that describe the same issue, choose the one that contains the largest amount of useful information and close the others, stating clearly the number of the superseding PR. If several PRs contain non-overlapping useful information, submit all the missing information to one in a followup, including references to the others; then close the other PRs (which are now completely superseded). [[pr-stale]] == Stale PRs A PR is considered stale if it has not been modified in more than six months. Apply the following procedure to deal with stale PRs: * If the PR contains sufficient detail, try to reproduce the problem in `-CURRENT` and `-STABLE`. If you succeed, submit a followup detailing your findings and try to find someone to assign it to. Set the state to "analyzed" if appropriate. 
* If the PR describes an issue which you know is the result of a usage error (incorrect configuration or otherwise), submit a followup explaining what the originator did wrong, then close the PR with the reason "User error" or "Configuration error". * If the PR describes an error which you know has been corrected in both `-CURRENT` and `-STABLE`, close it with a message stating when it was fixed in each branch. * If the PR describes an error which you know has been corrected in `-CURRENT`, but not in `-STABLE`, try to find out when the person who corrected it is planning to MFC it, or try to find someone else (maybe yourself?) to do it. Set the state to "patched" and assign it to whomever will do the MFC. * In other cases, ask the originator to confirm if the problem still exists in newer versions. If the originator does not reply within a month, close the PR with the notation "Feedback timeout". [[pr-misfiled-notpr]] == Non-Bug PRs Developers that come across PRs that look like they should have been posted to {freebsd-bugs} or some other list should close the PR, informing the submitter in a comment why this is not really a PR and where the message should be posted. The email addresses that Bugzilla listens to for incoming PRs have been published as part of the FreeBSD documentation, have been announced and listed on the web-site. This means that spammers found them. Whenever you close one of these PRs, please do the following: -* Set the component to `junk` (under `Supporting Services`. +* Set the component to `junk` (under `Supporting Services`). * Set Responsible to `nobody@FreeBSD.org`. * Set State to `Issue Resolved`. Setting the category to `junk` makes it obvious that there is no useful content within the PR, and helps to reduce the clutter within the main categories. [[references]] == Further Reading This is a list of resources relevant to the proper writing and processing of problem reports. It is by no means complete. 
* extref:{problem-reports}[How to Write FreeBSD Problem Reports]-guidelines for PR originators. diff --git a/documentation/content/en/books/arch-handbook/boot/_index.adoc b/documentation/content/en/books/arch-handbook/boot/_index.adoc index ef1d8932d1..8508f6f116 100644 --- a/documentation/content/en/books/arch-handbook/boot/_index.adoc +++ b/documentation/content/en/books/arch-handbook/boot/_index.adoc @@ -1,1702 +1,1702 @@ --- title: Chapter 1. Bootstrapping and Kernel Initialization prev: books/arch-handbook/parti next: books/arch-handbook/locking description: Bootstrapping and Kernel Initialization tags: ["boot", "BIOS", "kernel", "MBR", "FreeBSD"] showBookMenu: true weight: 2 params: path: "/books/arch-handbook/boot/" --- [[boot]] = Bootstrapping and Kernel Initialization :doctype: book :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :sectnumoffset: 1 :partnums: :source-highlighter: rouge :experimental: :images-path: books/arch-handbook/ ifdef::env-beastie[] ifdef::backend-html5[] :imagesdir: ../../../../images/{images-path} endif::[] ifndef::book[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] toc::[] endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] toc::[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [[boot-synopsis]] == Synopsis This chapter is an overview of the boot and system initialization processes, starting from the BIOS (firmware) POST, to the first user process creation. Since the initial steps of system startup are very architecture dependent, the IA-32 architecture is used as an example. 
The AMD64 and ARM64 architectures, however, are more relevant today, and coverage of them should be added to this document in the future. The FreeBSD boot process can be surprisingly complex. After control is passed from the BIOS, a considerable amount of low-level configuration must be done before the kernel can be loaded and executed. This setup must be done in a simple and flexible manner, allowing the user a great deal of customization possibilities. [[boot-overview]] == Overview The boot process is an extremely machine-dependent activity. Not only must code be written for every computer architecture, but there may also be multiple types of booting on the same architecture. For example, a directory listing of [.filename]#stand# reveals a great amount of architecture-dependent code. There is a directory for each of the various supported architectures. FreeBSD supports booting both through the CSM (Compatibility Support Module) standard, with either GPT or MBR partitioning, and through UEFI, where GPT is fully supported and MBR is mostly supported. It also supports loading files from ext2fs, MSDOS, UFS and ZFS. FreeBSD also supports the boot environment feature of ZFS, which allows the host OS to communicate details about what to boot that go beyond the simple partition choice that was possible in the past. These days, however, UEFI is more relevant than the CSM. The example that follows shows booting an x86 computer from an MBR-partitioned hard drive with the FreeBSD [.filename]#boot0# multi-boot loader stored in the very first sector. That boot code starts the FreeBSD three-stage boot process. The key to understanding this process is that it is a series of stages of increasing complexity. These stages are [.filename]#boot1#, [.filename]#boot2#, and [.filename]#loader# (see man:boot[8] for more detail). The boot system executes each stage in sequence. The last stage, [.filename]#loader#, is responsible for loading the FreeBSD kernel.
Each stage is examined in the following sections. Here is an example of the output generated by the different boot stages. Actual output may differ from machine to machine: [.informaltable] [cols="20%,80%", frame="none"] |=== |*FreeBSD Component* |*Output (may vary)* |`boot0` a| [source,bash] .... F1 FreeBSD F2 BSD F5 Disk 2 .... |`boot2` footnote:[This prompt will appear if the user presses a key just after selecting an OS to boot at the boot0 stage.] a| [source,bash] .... >>FreeBSD/x86 BOOT Default: 0:ad(0p4)/boot/loader boot: .... |[.filename]#loader# a| [source,bash] .... BTX loader 1.00 BTX version is 1.02 Consoles: internal video/keyboard BIOS drive C: is disk0 BIOS 639kB/2096064kB available memory FreeBSD/x86 bootstrap loader, Revision 1.1 Console internal video/keyboard (root@releng1.nyi.freebsd.org, Fri Apr 9 04:04:45 UTC 2021) Loading /boot/defaults/loader.conf /boot/kernel/kernel text=0xed9008 data=0x117d28+0x176650 syms=[0x8+0x137988+0x8+0x1515f8] .... |kernel a| [source,bash] .... Copyright (c) 1992-2021 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 13.0-RELEASE 0 releng/13.0-n244733-ea31abc261f: Fri Apr 9 04:04:45 UTC 2021 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/i386.i386/sys/GENERIC i386 FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe) .... |=== [[boot-bios]] == The BIOS When the computer powers on, the processor's registers are set to some predefined values. One of the registers is the _instruction pointer_ register, and its value after a power on is well defined: it is a 32-bit value of `0xfffffff0`. The instruction pointer register (also known as the Program Counter) points to code to be executed by the processor. 
Another important register is the `cr0` 32-bit control register, and its value just after a reboot is `0`. One of ``cr0``'s bits, the PE (Protection Enabled) bit, indicates whether the processor is running in 32-bit protected mode or 16-bit real mode. Since this bit is cleared at boot time, the processor boots in 16-bit real mode. Real mode means, among other things, that linear and physical addresses are identical. The reason for the processor not to start immediately in 32-bit protected mode is backwards compatibility. In particular, the boot process relies on the services provided by the BIOS, and the BIOS itself works in legacy, 16-bit code. The value of `0xfffffff0` is slightly less than 4 GB, so unless the machine has 4 GB of physical memory, it cannot point to a valid memory address. The computer's hardware translates this address so that it points to a BIOS memory block. The BIOS (Basic Input Output System) is a chip on the motherboard that has a relatively small amount of read-only memory (ROM). This memory contains various low-level routines that are specific to the hardware supplied with the motherboard. The processor will first jump to the address 0xfffffff0, which really resides in the BIOS's memory. Usually this address contains a jump instruction to the BIOS's POST routines. The POST (Power On Self Test) is a set of routines including the memory check, system bus check, and other low-level initialization so the CPU can set up the computer properly. The important step of this stage is determining the boot device. Modern BIOS implementations permit the selection of a boot device, allowing booting from a floppy, CD-ROM, hard disk, or other devices. The very last thing in the POST is the `INT 0x19` instruction. The `INT 0x19` handler reads 512 bytes from the first sector of the boot device into memory at address `0x7c00`.
The term _first sector_ originates from hard drive architecture, where the magnetic plate is divided into a number of cylindrical tracks. Tracks are numbered, and every track is divided into a number (usually 64) of sectors. Track numbers start at 0, but sector numbers start from 1. Track 0 is the outermost on the magnetic plate, and sector 1, the first sector, has a special purpose. It is also called the MBR, or Master Boot Record. The remaining sectors on the first track are never used. This sector is our boot-sequence starting point. As we will see, this sector contains a copy of our [.filename]#boot0# program. A jump is made by the BIOS to address `0x7c00` so it starts executing. [[boot-boot0]] == The Master Boot Record (`boot0`) After control is received from the BIOS at memory address `0x7c00`, [.filename]#boot0# starts executing. It is the first piece of code under FreeBSD control. The task of [.filename]#boot0# is quite simple: scan the partition table and let the user choose which partition to boot from. The Partition Table is a special, standard data structure embedded in the MBR (hence embedded in [.filename]#boot0#) describing the four standard PC "partitions". [.filename]#boot0# resides in the filesystem as [.filename]#/boot/boot0#. It is a small 512-byte file, and it is exactly what FreeBSD's installation procedure wrote to the hard disk's MBR if you chose the "bootmanager" option at installation time. Indeed, [.filename]#boot0# _is_ the MBR. As mentioned previously, we're calling the BIOS `INT 0x19` to load the MBR ([.filename]#boot0#) into memory at address `0x7c00`. The source file for [.filename]#boot0# can be found in [.filename]#stand/i386/boot0/boot0.S# - which is an awesome piece of code written by Robert Nordier. A special structure starting from offset `0x1be` in the MBR is called the _partition table_. 
It has four records of 16 bytes each, called _partition records_, which represent how the hard disk is partitioned, or, in FreeBSD's terminology, sliced. One byte of those 16 says whether a partition (slice) is bootable or not. Exactly one record must have that flag set, otherwise [.filename]#boot0#'s code will refuse to proceed. A partition record has the following fields: * the 1-byte filesystem type * the 1-byte bootable flag * the 6-byte descriptor in CHS format * the 8-byte descriptor in LBA format A partition record descriptor contains information about where exactly the partition resides on the drive. Both descriptors, LBA and CHS, describe the same information, but in different ways: LBA (Logical Block Addressing) has the starting sector for the partition and the partition's length, while CHS (Cylinder Head Sector) has coordinates for the first and last sectors of the partition. The partition table ends with the special signature `0xaa55`. The MBR must fit into 512 bytes, a single disk sector. This program uses low-level "tricks" like taking advantage of the side effects of certain instructions and reusing register values from previous operations to make the most out of the fewest possible instructions. Care must also be taken when handling the partition table, which is embedded in the MBR itself. For these reasons, be very careful when modifying [.filename]#boot0.S#. Note that the [.filename]#boot0.S# source file is assembled "as is": instructions are translated one by one to binary, with no additional information (no ELF file format, for example). This kind of low-level control is achieved at link time through special control flags passed to the linker. For example, the text section of the program is set to be located at address `0x600`. In practice this means that [.filename]#boot0# must be loaded to memory address `0x600` in order to function properly.
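To make the partition record layout concrete, here is an illustrative Python sketch (not FreeBSD code) that decodes one 16-byte record and applies the classic CHS-to-LBA conversion; the disk geometry defaults (255 heads, 63 sectors per track) are assumptions for the example, not values read from any real drive:

```python
import struct

def parse_partition_record(rec):
    """Decode one 16-byte MBR partition record."""
    assert len(rec) == 16
    bootable = rec[0]                  # 0x80 = bootable, 0x00 = not
    # CHS of the first sector: head byte, then sector (low 6 bits)
    # with the cylinder's high bits, then the cylinder's low byte.
    first_head = rec[1]
    first_sector = rec[2] & 0x3F
    first_cyl = ((rec[2] & 0xC0) << 2) | rec[3]
    fs_type = rec[4]                   # filesystem (slice) type byte
    # Bytes 8-15: starting sector and length, as little-endian LBA.
    lba_start, lba_length = struct.unpack_from("<II", rec, 8)
    return {
        "bootable": bootable == 0x80,
        "chs_first": (first_cyl, first_head, first_sector),
        "type": fs_type,
        "lba_start": lba_start,
        "lba_length": lba_length,
    }

def chs_to_lba(cyl, head, sector, heads_per_cyl=255, sectors_per_track=63):
    """Classic CHS -> LBA formula; note that sector numbers start at 1."""
    return (cyl * heads_per_cyl + head) * sectors_per_track + (sector - 1)
```

Both descriptors name the same sectors, which is why the conversion above is possible at all; the CHS form is limited by the geometry fields' widths, while the 32-bit LBA form is not.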
It is worth looking at the [.filename]#Makefile# for [.filename]#boot0# ([.filename]#stand/i386/boot0/Makefile#), as it defines some of the run-time behavior of [.filename]#boot0#.
For instance, if a terminal connected to the serial port (COM1) is used for I/O, the macro `SIO` must be defined (`-DSIO`).
`-DPXE` enables boot through PXE by pressing kbd:[F6].
Additionally, the program defines a set of _flags_ that allow further modification of its behavior.
All of this is illustrated in the [.filename]#Makefile#.
For example, look at the linker directives which command the linker to start the text section at address `0x600`, and to build the output file "as is" (strip out any file formatting):

[.programlisting]
....
BOOT_BOOT0_ORG?=0x600
ORG=${BOOT_BOOT0_ORG}
....
.[.filename]#stand/i386/boot0/Makefile#
[[boot-boot0-makefile-as-is]]

Let us now start our study of the MBR, or [.filename]#boot0#, starting where execution begins.

[NOTE]
====
Some modifications have been made to some instructions in favor of better exposition.
For example, some macros are expanded, and some macro tests are omitted when the result of the test is known.
This applies to all of the code examples shown.
====

[.programlisting]
....
start:
	cld			# String ops inc
	xorw %ax,%ax		# Zero
	movw %ax,%es		# Address
	movw %ax,%ds		#  data
	movw %ax,%ss		# Set up
	movw $LOAD,%sp		#  stack
....
.[.filename]#stand/i386/boot0/boot0.S#
[[boot-boot0-entrypoint]]

This first block of code is the entry point of the program.
It is where the BIOS transfers control.
First, it makes sure that the string operations autoincrement its pointer operands (the `cld` instruction) footnote:[When in doubt, we refer the reader to the official Intel manuals, which describe the exact semantics for each instruction.].
Then, as it makes no assumption about the state of the segment registers, it initializes them.
Finally, it sets the stack pointer register (`%sp`) to `$LOAD` (address `0x7c00`), so we have a working stack.

The next block is responsible for the relocation and subsequent jump to the relocated code.

[.programlisting]
....
	movw %sp,%si		# Source
	movw $start,%di		# Destination
	movw $0x100,%cx		# Word count
	rep			# Relocate
	movsw			#  code
	movw %di,%bp		# Address variables
	movb $0x8,%cl		# Words to clear
	rep			# Zero
	stosw			#  them
	incb -0xe(%di)		# Set the S field to 1
	jmp main-LOAD+ORIGIN	# Jump to relocated code
....
.[.filename]#stand/i386/boot0/boot0.S#
[[boot-boot0-relocation]]

As [.filename]#boot0# is loaded by the BIOS to address `0x7C00`, it copies itself to address `0x600` and then transfers control there (recall that it was linked to execute at address `0x600`).
The source address, `0x7c00`, is copied to register `%si`.
The destination address, `0x600`, to register `%di`.
The number of words to copy, `256` (the program's size = 512 bytes), is copied to register `%cx`.

Next, the `rep` instruction repeats the instruction that follows, that is, `movsw`, the number of times dictated by the `%cx` register.
The `movsw` instruction copies the word pointed to by `%si` to the address pointed to by `%di`.
This is repeated another 255 times.
On each repetition, both the source and destination registers, `%si` and `%di`, are incremented by one word (two bytes).
Thus, upon completion of the 256-word (512-byte) copy, `%di` has the value `0x600`+`512`= `0x800`, and `%si` has the value `0x7c00`+`512`= `0x7e00`; we have thus completed the code _relocation_.
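The relocation loop can be mimicked with a few lines of Python, modeling low memory as a flat byte array (a sketch; the stand-in image contents are arbitrary):

```python
# Simulate `rep movsw`: copy 0x100 words from 0x7c00 down to 0x600.
mem = bytearray(0x10000)                      # flat model of low memory
mem[0x7C00:0x7E00] = bytes(range(256)) * 2    # stand-in for the 512-byte boot0 image
si, di = 0x7C00, 0x600                        # %si = source, %di = destination
for _ in range(0x100):                        # %cx = 0x100 words
    mem[di:di + 2] = mem[si:si + 2]           # movsw copies one word
    si += 2                                   # both pointers auto-increment
    di += 2                                   # (direction flag cleared by cld)
assert mem[0x600:0x800] == mem[0x7C00:0x7E00] # code relocated intact
assert (si, di) == (0x7E00, 0x800)            # final register values, as described
```

The final pointer values match the `0x7e00` and `0x800` derived in the text.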
Note that the copy instructions have changed relative to earlier versions of this document: instead of `movsb` and `stosb`, the code now uses `movsw` and `stosw`, which copy two bytes (one word) per iteration.

Next, the destination register `%di` is copied to `%bp`.
`%bp` gets the value `0x800`.
The value `8` is copied to `%cl` in preparation for a new string operation (like our previous `movsw`).
Now, `stosw` is executed 8 times.
This instruction copies a `0` value to the address pointed to by the destination register (`%di`, which is `0x800`), and advances it by one word.
This is repeated another 7 times, so `%di` ends up with value `0x810`.
Effectively, this clears the address range `0x800`-`0x80f`.
This range is used as a (fake) partition table for writing the MBR back to disk.
Finally, the sector field for the CHS addressing of this fake partition is given the value 1 and a jump is made to the main function from the relocated code.

Note that until this jump to the relocated code, any reference to an absolute address was avoided.

The following code block tests whether the drive number provided by the BIOS should be used, or the one stored in [.filename]#boot0#.

[.programlisting]
....
main:
	testb $SETDRV,_FLAGS(%bp)	# Set drive number?
#ifndef CHECK_DRIVE	/* disable drive checks */
	jz save_curdrive		# no, use the default
#else
	jnz disable_update		# Yes
	testb %dl,%dl			# Drive number valid?
	js save_curdrive		# Possibly (0x80 set)
#endif
....
.[.filename]#stand/i386/boot0/boot0.S#
[[boot-boot0-drivenumber]]

This code tests the `SETDRV` bit (`0x20`) in the _flags_ variable.
Recall that register `%bp` points to address location `0x800`, so the test is done to the _flags_ variable at address `0x800`-`69`= `0x7bb`.
This is an example of the type of modifications that can be done to [.filename]#boot0#.
The `SETDRV` flag is not set by default, but it can be set in the [.filename]#Makefile#.
When set, the drive number stored in the MBR is used instead of the one provided by the BIOS.
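The address arithmetic in this block can be checked numerically (a sketch; memory is again modeled as a flat array):

```python
mem = bytearray(0x10000)
bp = 0x800                        # %bp after relocation (movw %di,%bp)
di = bp
for _ in range(8):                # rep stosw with %cx = 8
    mem[di:di + 2] = b"\x00\x00"  # each stosw stores one zero word
    di += 2
assert di == 0x810                # cleared range: 0x800-0x80f, the fake partition table
mem[di - 0xE] += 1                # incb -0xe(%di): set the CHS sector (S) field to 1
assert mem[0x802] == 1
assert bp - 69 == 0x7BB           # where the _flags_ test `_FLAGS(%bp)` lands
```

`0x810 - 0xe = 0x802` lands inside the fake partition's CHS descriptor, consistent with the `incb -0xe(%di)` comment in the assembly.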
We assume the defaults, and that the BIOS provided a valid drive number, so we jump to `save_curdrive`. The next block saves the drive number provided by the BIOS, and calls `putn` to print a new line on the screen. [.programlisting] .... save_curdrive: movb %dl, (%bp) # Save drive number pushw %dx # Also in the stack #ifdef TEST /* test code, print internal bios drive */ rolb $1, %dl movw $drive, %si call putkey #endif callw putn # Print a newline .... .[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-savedrivenumber]] Note that we assume `TEST` is not defined, so the conditional code in it is not assembled and will not appear in our executable [.filename]#boot0#. Our next block implements the actual scanning of the partition table. It prints to the screen the partition type for each of the four entries in the partition table. It compares each type with a list of well-known operating system file systems. Examples of recognized partition types are NTFS (Windows(R), ID 0x7), `ext2fs` (Linux(R), ID 0x83), and, of course, `ffs`/`ufs2` (FreeBSD, ID 0xa5). The implementation is fairly simple. [.programlisting] .... movw $(partbl+0x4),%bx # Partition table (+4) xorw %dx,%dx # Item number read_entry: movb %ch,-0x4(%bx) # Zero active flag (ch == 0) btw %dx,_FLAGS(%bp) # Entry enabled? jnc next_entry # No movb (%bx),%al # Load type test %al, %al # skip empty partition jz next_entry movw $bootable_ids,%di # Lookup tables movb $(TLEN+1),%cl # Number of entries repne # Locate scasb # type addw $(TLEN-1), %di # Adjust movb (%di),%cl # Partition addw %cx,%di # description callw putx # Display it next_entry: incw %dx # Next item addb $0x10,%bl # Next entry jnc read_entry # Till done .... .[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-partition-scan]] It is important to note that the active flag for each entry is cleared, so after the scanning, _no_ partition entry is active in our memory copy of [.filename]#boot0#. 
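The type lookup just described can be sketched in Python. This dictionary is only a readable stand-in: the real [.filename]#boot0# keeps a compact byte table (`bootable_ids`) scanned with `repne`/`scasb`, and prints short display strings rather than these names:

```python
# Abridged subset of the partition types boot0 recognizes.
PARTITION_TYPES = {0x07: "NTFS", 0x83: "Linux", 0xA5: "FreeBSD"}

def partition_label(ptype):
    # Unknown types fall through to a generic label.
    return PARTITION_TYPES.get(ptype, "??")

assert partition_label(0xA5) == "FreeBSD"
assert partition_label(0x07) == "NTFS"
assert partition_label(0x42) == "??"
```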
Later, the active flag will be set for the selected partition. This ensures that only one active partition exists if the user chooses to write the changes back to disk. The next block tests for other drives. At startup, the BIOS writes the number of drives present in the computer to address `0x475`. If there are any other drives present, [.filename]#boot0# prints the current drive to screen. The user may command [.filename]#boot0# to scan partitions on another drive later. [.programlisting] .... popw %ax # Drive number subb $0x80-0x1,%al # Does next cmpb NHRDRV,%al # drive exist? (from BIOS?) jb print_drive # Yes decw %ax # Already drive 0? jz print_prompt # Yes .... .[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-test-drives]] We make the assumption that a single drive is present, so the jump to `print_drive` is not performed. We also assume nothing strange happened, so we jump to `print_prompt`. This next block just prints out a prompt followed by the default option: [.programlisting] .... print_prompt: movw $prompt,%si # Display callw putstr # prompt movb _OPT(%bp),%dl # Display decw %si # default callw putkey # key jmp start_input # Skip beep .... .[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-prompt]] Finally, a jump is performed to `start_input`, where the BIOS services are used to start a timer and for reading user input from the keyboard; if the timer expires, the default option will be selected: [.programlisting] .... start_input: xorb %ah,%ah # BIOS: Get int $0x1a # system time movw %dx,%di # Ticks when addw _TICKS(%bp),%di # timeout read_key: movb $0x1,%ah # BIOS: Check int $0x16 # for keypress jnz got_key # Have input xorb %ah,%ah # BIOS: int 0x1a, 00 int $0x1a # get system time cmpw %di,%dx # Timeout? jb read_key # No .... .[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-start-input]] An interrupt is requested with number `0x1a` and argument `0` in register `%ah`. 
The BIOS has a predefined set of services, requested by applications as software-generated interrupts through the `int` instruction and receiving arguments in registers (in this case, `%ah`). Here, particularly, we are requesting the number of clock ticks since last midnight; this value is computed by the BIOS through the RTC (Real Time Clock). This clock can be programmed to work at frequencies ranging from 2 Hz to 8192 Hz. The BIOS sets it to 18.2 Hz at startup. When the request is satisfied, a 32-bit result is returned by the BIOS in registers `%cx` and `%dx` (lower bytes in `%dx`). This result (the `%dx` part) is copied to register `%di`, and the value of the `TICKS` variable is added to `%di`. This variable resides in [.filename]#boot0# at offset `_TICKS` (a negative value) from register `%bp` (which, recall, points to `0x800`). The default value of this variable is `0xb6` (182 in decimal). Now, the idea is that [.filename]#boot0# constantly requests the time from the BIOS, and when the value returned in register `%dx` is greater than the value stored in `%di`, the time is up and the default selection will be made. Since the RTC ticks 18.2 times per second, this condition will be met after 10 seconds (this default behavior can be changed in the [.filename]#Makefile#). Until this time has passed, [.filename]#boot0# continually asks the BIOS for any user input; this is done through `int 0x16`, argument `1` in `%ah`. Whether a key was pressed or the time expired, subsequent code validates the selection. Based on the selection, the register `%si` is set to point to the appropriate partition entry in the partition table. This new selection overrides the previous default one. Indeed, it becomes the new default. Finally, the ACTIVE flag of the selected partition is set. If it was enabled at compile time, the in-memory version of [.filename]#boot0# with these modified values is written back to the MBR on disk. We leave the details of this implementation to the reader. 
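The timeout arithmetic above is easy to verify: at the BIOS default tick rate, the default `_TICKS` value of `0xb6` works out to roughly ten seconds:

```python
RTC_HZ = 18.2   # BIOS default tick rate of the real-time clock
TICKS = 0xB6    # default _TICKS value in boot0
assert TICKS == 182
assert round(TICKS / RTC_HZ) == 10   # default timeout: about ten seconds
```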
We now end our study with the last code block from the [.filename]#boot0# program: [.programlisting] .... movw $LOAD,%bx # Address for read movb $0x2,%ah # Read sector callw intx13 # from disk jc beep # If error cmpw $MAGIC,0x1fe(%bx) # Bootable? jne beep # No pushw %si # Save ptr to selected part. callw putn # Leave some space popw %si # Restore, next stage uses it jmp *%bx # Invoke bootstrap .... .[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-check-bootable]] Recall that `%si` points to the selected partition entry. This entry tells us where the partition begins on disk. We assume, of course, that the partition selected is actually a FreeBSD slice. [NOTE] ==== From now on, we will favor the use of the technically more accurate term "slice" rather than "partition". ==== The transfer buffer is set to `0x7c00` (register `%bx`), and a read for the first sector of the FreeBSD slice is requested by calling `intx13`. We assume that everything went okay, so a jump to `beep` is not performed. In particular, the new sector read must end with the magic sequence `0xaa55`. Finally, the value at `%si` (the pointer to the selected partition table) is preserved for use by the next stage, and a jump is performed to address `0x7c00`, where execution of our next stage (the just-read block) is started. [[boot-boot1]] == `boot1` Stage So far we have gone through the following sequence: * The BIOS did some early hardware initialization, including the POST. The MBR ([.filename]#boot0#) was loaded from absolute disk sector one to address `0x7c00`. Execution control was passed to that location. * [.filename]#boot0# relocated itself to the location it was linked to execute (`0x600`), followed by a jump to continue execution at the appropriate place. Finally, [.filename]#boot0# loaded the first disk sector from the FreeBSD slice to address `0x7c00`. Execution control was passed to that location. [.filename]#boot1# is the next step in the boot-loading sequence. 
It is the first of three boot stages.
Note that we have been dealing exclusively with disk sectors.
Indeed, the BIOS loads the absolute first sector, while [.filename]#boot0# loads the first sector of the FreeBSD slice.
Both loads are to address `0x7c00`.
We can conceptually think of these disk sectors as containing the files [.filename]#boot0# and [.filename]#boot1#, respectively, but in reality this is not entirely true for [.filename]#boot1#.
Strictly speaking, unlike [.filename]#boot0#, [.filename]#boot1# is not part of the boot blocks footnote:[There is a file /boot/boot1, but it is not written to the beginning of the FreeBSD slice. Instead, it is concatenated with boot2 to form boot, which is written to the beginning of the FreeBSD slice and read at boot time.].
Instead, a single, full-blown file, [.filename]#boot# ([.filename]#/boot/boot#), is what ultimately is written to disk.
This file is a combination of [.filename]#boot1#, [.filename]#boot2# and the `Boot Extender` (or BTX).
This single file is greater in size than a single sector (greater than 512 bytes).
Fortunately, [.filename]#boot1# occupies _exactly_ the first 512 bytes of this single file, so when [.filename]#boot0# loads the first sector of the FreeBSD slice (512 bytes), it is actually loading [.filename]#boot1# and transferring control to it.

The main task of [.filename]#boot1# is to load the next boot stage.
This next stage is somewhat more complex.
It is composed of a server called the "Boot Extender", or BTX, and a client, called [.filename]#boot2#.
As we will see, the last boot stage, [.filename]#loader#, is also a client of the BTX server.

Let us now look in detail at what exactly is done by [.filename]#boot1#, starting like we did for [.filename]#boot0#, at its entry point:

[.programlisting]
....
start:
	jmp main
....
.[.filename]#stand/i386/boot2/boot1.S#
[[boot-boot1-entry]]

The entry point at `start` simply jumps past a special data area to the label `main`, which in turn looks like this:

[.programlisting]
....
main:
	cld			# String ops inc
	xor %cx,%cx		# Zero
	mov %cx,%es		# Address
	mov %cx,%ds		#  data
	mov %cx,%ss		# Set up
	mov $start,%sp		#  stack
	mov %sp,%si		# Source
	mov $MEM_REL,%di	# Destination
	incb %ch		# Word count
	rep			# Copy
	movsw			#  code
....
.[.filename]#stand/i386/boot2/boot1.S#
[[boot-boot1-main]]

Just like [.filename]#boot0#, this code relocates [.filename]#boot1#, this time to memory address `0x700`.
However, unlike [.filename]#boot0#, it does not jump there.
[.filename]#boot1# is linked to execute at address `0x7c00`, effectively where it was loaded in the first place.
The reason for this relocation will be discussed shortly.

Next comes a loop that looks for the FreeBSD slice.
Although [.filename]#boot0# loaded [.filename]#boot1# from the FreeBSD slice, no information was passed to it about this footnote:[Actually we did pass a pointer to the slice entry in register %si. However, boot1 does not assume that it was loaded by boot0 (perhaps some other MBR loaded it, and did not pass this information), so it assumes nothing.], so [.filename]#boot1# must rescan the partition table to find where the FreeBSD slice starts.
Therefore it rereads the MBR:

[.programlisting]
....
	mov $part4,%si		# Partition
	cmpb $0x80,%dl		# Hard drive?
	jb main.4		# No
	movb $0x1,%dh		# Block count
	callw nread		# Read MBR
....
.[.filename]#stand/i386/boot2/boot1.S#
[[boot-boot1-find-freebsd]]

In the code above, register `%dl` maintains information about the boot device.
This is passed on by the BIOS and preserved by the MBR.
Numbers `0x80` and greater tell us that we are dealing with a hard drive, so a call is made to `nread`, where the MBR is read.
Arguments to `nread` are passed through `%si` and `%dh`.
The memory address at label `part4` is copied to `%si`.
This memory address holds a "fake partition" to be used by `nread`. The following is the data in the fake partition: [.programlisting] .... part4: .byte 0x80, 0x00, 0x01, 0x00 .byte 0xa5, 0xfe, 0xff, 0xff .byte 0x00, 0x00, 0x00, 0x00 .byte 0x50, 0xc3, 0x00, 0x00 .... .[.filename]#stand/i386/boot2/boot1.S# [[boot-boot2-make-fake-partition]] In particular, the LBA for this fake partition is hardcoded to zero. This is used as an argument to the BIOS for reading absolute sector one from the hard drive. Alternatively, CHS addressing could be used. In this case, the fake partition holds cylinder 0, head 0 and sector 1, which is equivalent to absolute sector one. Let us now proceed to take a look at `nread`: [.programlisting] .... nread: mov $MEM_BUF,%bx # Transfer buffer mov 0x8(%si),%ax # Get mov 0xa(%si),%cx # LBA push %cs # Read from callw xread.1 # disk jnc return # If success, return .... .[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-nread]] Recall that `%si` points to the fake partition. The word footnote:[In the context of 16-bit real mode, a word is 2 bytes.] at offset `0x8` is copied to register `%ax` and word at offset `0xa` to `%cx`. They are interpreted by the BIOS as the lower 4-byte value denoting the LBA to be read (the upper four bytes are assumed to be zero). Register `%bx` holds the memory address where the MBR will be loaded. The instruction pushing `%cs` onto the stack is very interesting. In this context, it accomplishes nothing. However, as we will see shortly, [.filename]#boot2#, in conjunction with the BTX server, also uses `xread.1`. This mechanism will be discussed in the next section. The code at `xread.1` further calls the `read` function, which actually calls the BIOS asking for the disk sector: [.programlisting] .... 
xread.1: pushl $0x0 # absolute push %cx # block push %ax # number push %es # Address of push %bx # transfer buffer xor %ax,%ax # Number of movb %dh,%al # blocks to push %ax # transfer push $0x10 # Size of packet mov %sp,%bp # Packet pointer callw read # Read from disk lea 0x10(%bp),%sp # Clear stack lret # To far caller .... .[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-xread1]] Note the long return instruction at the end of this block. This instruction pops out the `%cs` register pushed by `nread`, and returns. Finally, `nread` also returns. With the MBR loaded to memory, the actual loop for searching the FreeBSD slice begins: [.programlisting] .... mov $0x1,%cx # Two passes main.1: mov $MEM_BUF+PRT_OFF,%si # Partition table movb $0x1,%dh # Partition main.2: cmpb $PRT_BSD,0x4(%si) # Our partition type? jne main.3 # No jcxz main.5 # If second pass testb $0x80,(%si) # Active? jnz main.5 # Yes main.3: add $0x10,%si # Next entry incb %dh # Partition cmpb $0x1+PRT_NUM,%dh # In table? jb main.2 # Yes dec %cx # Do two jcxz main.1 # passes .... .[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-find-part]] If a FreeBSD slice is identified, execution continues at `main.5`. Note that when a FreeBSD slice is found `%si` points to the appropriate entry in the partition table, and `%dh` holds the partition number. We assume that a FreeBSD slice is found, so we continue execution at `main.5`: [.programlisting] .... main.5: mov %dx,MEM_ARG # Save args movb $NSECT,%dh # Sector count callw nread # Read disk mov $MEM_BTX,%bx # BTX mov 0xa(%bx),%si # Get BTX length and set add %bx,%si # %si to start of boot2.bin mov $MEM_USR+SIZ_PAG*2,%di # Client page 2 mov $MEM_BTX+(NSECT-1)*SIZ_SEC,%cx # Byte sub %si,%cx # count rep # Relocate movsb # client .... 
.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-main5]] Recall that at this point, register `%si` points to the FreeBSD slice entry in the MBR partition table, so a call to `nread` will effectively read sectors at the beginning of this partition. The argument passed on register `%dh` tells `nread` to read 16 disk sectors. Recall that the first 512 bytes, or the first sector of the FreeBSD slice, coincides with the [.filename]#boot1# program. Also recall that the file written to the beginning of the FreeBSD slice is not [.filename]#/boot/boot1#, but [.filename]#/boot/boot#. Let us look at the size of these files in the filesystem: [source,bash] .... -r--r--r-- 1 root wheel 512B Jan 8 00:15 /boot/boot0 -r--r--r-- 1 root wheel 512B Jan 8 00:15 /boot/boot1 -r--r--r-- 1 root wheel 7.5K Jan 8 00:15 /boot/boot2 -r--r--r-- 1 root wheel 8.0K Jan 8 00:15 /boot/boot .... Both [.filename]#boot0# and [.filename]#boot1# are 512 bytes each, so they fit _exactly_ in one disk sector. [.filename]#boot2# is much bigger, holding both the BTX server and the [.filename]#boot2# client. Finally, a file called simply [.filename]#boot# is 512 bytes larger than [.filename]#boot2#. This file is a concatenation of [.filename]#boot1# and [.filename]#boot2#. As already noted, [.filename]#boot0# is the file written to the absolute first disk sector (the MBR), and [.filename]#boot# is the file written to the first sector of the FreeBSD slice; [.filename]#boot1# and [.filename]#boot2# are _not_ written to disk. The command used to concatenate [.filename]#boot1# and [.filename]#boot2# into a single [.filename]#boot# is merely `cat boot1 boot2 > boot`. So [.filename]#boot1# occupies exactly the first 512 bytes of [.filename]#boot# and, because [.filename]#boot# is written to the first sector of the FreeBSD slice, [.filename]#boot1# fits exactly in this first sector. 
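Returning briefly to `nread` and the fake partition `part4` shown earlier, the way the 32-bit LBA is assembled from the two words at offsets `0x8` and `0xa` can be verified numerically:

```python
import struct

# The part4 record shown earlier, byte for byte.
part4 = bytes([0x80, 0x00, 0x01, 0x00,
               0xa5, 0xfe, 0xff, 0xff,
               0x00, 0x00, 0x00, 0x00,
               0x50, 0xc3, 0x00, 0x00])
ax = struct.unpack_from("<H", part4, 0x8)[0]  # word at offset 0x8 -> %ax
cx = struct.unpack_from("<H", part4, 0xa)[0]  # word at offset 0xa -> %cx
assert (cx << 16) | ax == 0   # LBA hardcoded to zero: absolute sector one
assert part4[4] == 0xA5       # type field: a FreeBSD slice
```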
When `nread` reads the first 16 sectors of the FreeBSD slice, it effectively reads the entire [.filename]#boot# file footnote:[512*16=8192 bytes, exactly the size of boot].
We will see more details about how [.filename]#boot# is formed from [.filename]#boot1# and [.filename]#boot2# in the next section.

Recall that `nread` uses memory address `0x8c00` as the transfer buffer to hold the sectors read.
This address is conveniently chosen.
Indeed, because [.filename]#boot1# belongs to the first 512 bytes, it ends up in the address range `0x8c00`-`0x8dff`.
The 512 bytes that follow (range `0x8e00`-`0x8fff`) are used to store the _bsdlabel_ footnote:[Historically known as disklabel. If you ever wondered where FreeBSD stored this information, it is in this region - see man:bsdlabel[8]].

Starting at address `0x9000` is the beginning of the BTX server, and immediately following is the [.filename]#boot2# client.
The BTX server acts as a kernel, and executes in protected mode in the most privileged level.
In contrast, the BTX clients ([.filename]#boot2#, for example), execute in user mode.
We will see how this is accomplished in the next section.

The code after the call to `nread` locates the beginning of [.filename]#boot2# in the memory buffer, and copies it to memory address `0xc000`.
This is because the BTX server arranges [.filename]#boot2# to execute in a segment starting at `0xa000`.
We explore this in detail in the following section.

The last code block of [.filename]#boot1# enables access to memory above 1MB footnote:[This is necessary for legacy reasons.] and concludes with a jump to the starting point of the BTX server:

[.programlisting]
....
seta20:
	cli			# Disable interrupts
seta20.1:
	dec %cx			# Timeout?
	jz seta20.3		# Yes
	inb $0x64,%al		# Get status
	testb $0x2,%al		# Busy?
	jnz seta20.1		# Yes
	movb $0xd1,%al		# Command: Write
	outb %al,$0x64		#  output port
seta20.2:
	inb $0x64,%al		# Get status
	testb $0x2,%al		# Busy?
	jnz seta20.2		# Yes
	movb $0xdf,%al		# Enable
	outb %al,$0x60		#  A20
seta20.3:
	sti			# Enable interrupts
	jmp 0x9010		# Start BTX
....
.[.filename]#stand/i386/boot2/boot1.S#
[[boot-boot1-seta20]]

Note that right before the jump, interrupts are enabled.

[[btx-server]]
== The BTX Server

Next in our boot sequence is the BTX Server.
Let us quickly remember how we got here:

* The BIOS loads the absolute sector one (the MBR, or [.filename]#boot0#), to address `0x7c00` and jumps there.
* [.filename]#boot0# relocates itself to `0x600`, the address it was linked to execute, and jumps over there. It then reads the first sector of the FreeBSD slice (which consists of [.filename]#boot1#) into address `0x7c00` and jumps over there.
* [.filename]#boot1# loads the first 16 sectors of the FreeBSD slice into address `0x8c00`. These 16 sectors, or 8192 bytes, are the whole file [.filename]#boot#. The file is a concatenation of [.filename]#boot1# and [.filename]#boot2#. [.filename]#boot2#, in turn, contains the BTX server and the [.filename]#boot2# client. Finally, a jump is made to address `0x9010`, the entry point of the BTX server.

Before studying the BTX Server in detail, let us further review how the single, all-in-one [.filename]#boot# file is created.
The way [.filename]#boot# is built is defined in its [.filename]#Makefile# ([.filename]#stand/i386/boot2/Makefile#).
Let us look at the rule that creates the [.filename]#boot# file:

[.programlisting]
....
boot: boot1 boot2
	cat boot1 boot2 > boot
....
.[.filename]#stand/i386/boot2/Makefile#
[[boot-boot1-make-boot]]

This tells us that [.filename]#boot1# and [.filename]#boot2# are needed, and the rule simply concatenates them to produce a single file called [.filename]#boot#.
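The effect of this rule can be mimicked in Python; placeholder bytes stand in for the real images, and the sizes are those listed earlier (512 bytes for [.filename]#boot1#, 7.5 KB for [.filename]#boot2#):

```python
SEC = 512
boot1 = bytes(SEC)             # stand-in for the 512-byte boot1 image
boot2 = bytes(15 * SEC)        # stand-in for boot2 (bsdlabel space + BTX + client)
boot = boot1 + boot2           # what `cat boot1 boot2 > boot` produces
assert boot[:SEC] == boot1     # boot1 occupies exactly the first sector
assert len(boot) == 16 * SEC   # 8192 bytes: the 16 sectors boot1 reads
```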
The rules for creating [.filename]#boot1# are also quite simple: [.programlisting] .... boot1: boot1.out ${OBJCOPY} -S -O binary boot1.out ${.TARGET} boot1.out: boot1.o ${LD} ${LD_FLAGS} -e start --defsym ORG=${ORG1} -T ${LDSCRIPT} -o ${.TARGET} boot1.o .... .[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot1]] To apply the rule for creating [.filename]#boot1#, [.filename]#boot1.out# must be resolved. This, in turn, depends on the existence of [.filename]#boot1.o#. This last file is simply the result of assembling our familiar [.filename]#boot1.S#, without linking. Now, the rule for creating [.filename]#boot1.out# is applied. This tells us that [.filename]#boot1.o# should be linked with `start` as its entry point, and starting at address `0x7c00`. Finally, [.filename]#boot1# is created from [.filename]#boot1.out# applying the appropriate rule. This rule is the [.filename]#objcopy# command applied to [.filename]#boot1.out#. Note the flags passed to [.filename]#objcopy#: `-S` tells it to strip all relocation and symbolic information; `-O binary` indicates the output format, that is, a simple, unformatted binary file. Having [.filename]#boot1#, let us take a look at how [.filename]#boot2# is constructed: [.programlisting] .... 
boot2: boot2.ld @set -- `ls -l ${.ALLSRC}`; x=$$((${BOOT2SIZE}-$$5)); \ echo "$$x bytes available"; test $$x -ge 0 ${DD} if=${.ALLSRC} of=${.TARGET} bs=${BOOT2SIZE} conv=sync boot2.ld: boot2.ldr boot2.bin ${BTXKERN} btxld -v -E ${ORG2} -f bin -b ${BTXKERN} -l boot2.ldr \ -o ${.TARGET} -P 1 boot2.bin boot2.ldr: ${DD} if=/dev/zero of=${.TARGET} bs=512 count=1 boot2.bin: boot2.out ${OBJCOPY} -S -O binary boot2.out ${.TARGET} boot2.out: ${BTXCRT} boot2.o sio.o ashldi3.o ${LD} ${LD_FLAGS} --defsym ORG=${ORG2} -T ${LDSCRIPT} -o ${.TARGET} ${.ALLSRC} boot2.h: boot1.out ${NM} -t d ${.ALLSRC} | awk '/([0-9])+ T xread/ \ { x = $$1 - ORG1; \ printf("#define XREADORG %#x\n", REL1 + x) }' \ ORG1=`printf "%d" ${ORG1}` \ REL1=`printf "%d" ${REL1}` > ${.TARGET} .... .[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot2]] The mechanism for building [.filename]#boot2# is far more elaborate. Let us point out the most relevant facts. The dependency list is as follows: [.programlisting] .... boot2: boot2.ld boot2.ld: boot2.ldr boot2.bin ${BTXDIR} boot2.bin: boot2.out boot2.out: ${BTXDIR} boot2.o sio.o ashldi3.o boot2.h: boot1.out .... .[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot2-more]] Note that initially there is no header file [.filename]#boot2.h#, but its creation depends on [.filename]#boot1.out#, which we already have. The rule for its creation is a bit terse, but the important thing is that the output, [.filename]#boot2.h#, is something like this: [.programlisting] .... #define XREADORG 0x725 .... .[.filename]#stand/i386/boot2/boot2.h# [[boot-boot1-make-boot2h]] Recall that [.filename]#boot1# was relocated (i.e., copied from `0x7c00` to `0x700`). This relocation will now make sense, because as we will see, the BTX server reclaims some memory, including the space where [.filename]#boot1# was originally loaded. 
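The awk rule above computes `XREADORG` as `REL1 + (xread - ORG1)`, translating `xread`'s linked address into its address in the relocated copy. With the values used in this article the arithmetic works out as follows (the linked address of `xread` is back-derived here purely for illustration):

```python
ORG1 = 0x7C00    # address boot1 is linked to execute at
REL1 = 0x700     # address boot1 relocates itself to
xread = 0x7C25   # linked address of xread in boot1.out (illustrative value)
XREADORG = REL1 + (xread - ORG1)
assert XREADORG == 0x725   # consistent with the generated boot2.h shown above
```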
However, the BTX server needs access to [.filename]#boot1#'s `xread` function; this function, according to the output of [.filename]#boot2.h#, is at location `0x725`.
Indeed, the BTX server uses the `xread` function from [.filename]#boot1#'s relocated code.
This function is now accessible from within the [.filename]#boot2# client.

The next rule directs the linker to link various files ([.filename]#ashldi3.o#, [.filename]#boot2.o# and [.filename]#sio.o#).
Note that the output file, [.filename]#boot2.out#, is linked to execute at address `0x2000` (${ORG2}).
Recall that [.filename]#boot2# will be executed in user mode, within a special user segment set up by the BTX server.
This segment starts at `0xa000`.
Also, remember that the [.filename]#boot2# portion of [.filename]#boot# was copied to address `0xc000`, that is, offset `0x2000` from the start of the user segment, so [.filename]#boot2# will work properly when we transfer control to it.
Next, [.filename]#boot2.bin# is created from [.filename]#boot2.out# by stripping its symbols and format information; [.filename]#boot2.bin# is a _raw_ binary.
Now, note that a file [.filename]#boot2.ldr# is created as a 512-byte file full of zeros.
This space is reserved for the bsdlabel.

Now that we have files [.filename]#boot1#, [.filename]#boot2.bin# and [.filename]#boot2.ldr#, only the BTX server is missing before creating the all-in-one [.filename]#boot# file.
The BTX server is located in [.filename]#stand/i386/btx/btx#; it has its own [.filename]#Makefile# with its own set of rules for building.
The important thing to notice is that it is also compiled as a _raw_ binary, and that it is linked to execute at address `0x9000`.
The details can be found in [.filename]#stand/i386/btx/btx/Makefile#.

Having the files that comprise the [.filename]#boot# program, the final step is to _merge_ them.
This is done by a special program called [.filename]#btxld# (source located in [.filename]#/usr/src/usr.sbin/btxld#).
Some arguments to this program include the name of the output file ([.filename]#boot#), its entry point (`0x2000`) and its file format (raw binary).
The various files are finally merged by this utility into the file [.filename]#boot#, which consists of [.filename]#boot1#, [.filename]#boot2#, the `bsdlabel` and the BTX server.
This file, which takes exactly 16 sectors, or 8192 bytes, is what is actually written to the beginning of the FreeBSD slice during installation.

Let us now proceed to study the BTX server program.
The BTX server prepares a simple environment and switches from 16-bit real mode to 32-bit protected mode, right before passing control to the client.
This includes initializing and updating the following data structures:

* The `Interrupt Vector Table (IVT)` is modified. The IVT provides exception and interrupt handlers for Real-Mode code.
* The `Interrupt Descriptor Table (IDT)` is created. Entries are provided for processor exceptions, hardware interrupts, two system calls and V86 interface. The IDT provides exception and interrupt handlers for Protected-Mode code.
* A `Task-State Segment (TSS)` is created. This is necessary because the processor works in the _least_ privileged level when executing the client ([.filename]#boot2#), but in the _most_ privileged level when executing the BTX server.
* The GDT (Global Descriptor Table) is set up. Entries (descriptors) are provided for supervisor code and data, user code and data, and real-mode code and data. footnote:[Real-mode code and data are necessary when switching back to real mode from protected mode, as suggested by the Intel manuals.]

Let us now start studying the actual implementation.
Recall that [.filename]#boot1# made a jump to address `0x9010`, the BTX server's entry point.
Before studying program execution there, note that the BTX server has a special header at address range `0x9000-0x900f`, right before its entry point.
This header is defined as follows:

[.programlisting]
....
start:						# Start of code
/*
 * BTX header.
 */
btx_hdr:	.byte 0xeb			# Machine ID
		.byte 0xe			# Header size
		.ascii "BTX"			# Magic
		.byte 0x1			# Major version
		.byte 0x2			# Minor version
		.byte BTX_FLAGS			# Flags
		.word PAG_CNT-MEM_ORG>>0xc	# Paging control
		.word break-start		# Text size
		.long 0x0			# Entry address
....
.[.filename]#stand/i386/btx/btx/btx.S#
[[btx-header]]

Note the first two bytes are `0xeb` and `0xe`.
In the IA-32 architecture, these two bytes are interpreted as a relative jump past the header into the entry point, so in theory, [.filename]#boot1# could jump here (address `0x9000`) instead of address `0x9010`.
Note that the last field in the BTX header is a pointer to the client's ([.filename]#boot2#) entry point.
This field is patched at link time.

Immediately following the header is the BTX server's entry point:

[.programlisting]
....
/*
 * Initialization routine.
 */
init:		cli				# Disable interrupts
		xor %ax,%ax			# Zero/segment
		mov %ax,%ss			# Set up
		mov $MEM_ESP0,%sp		#  stack
		mov %ax,%es			# Address
		mov %ax,%ds			#  data
		pushl $0x2			# Clear
		popfl				#  flags
....
.[.filename]#stand/i386/btx/btx/btx.S#
[[btx-init]]

This code disables interrupts, sets up a working stack (starting at address `0x1800`) and clears the flags in the EFLAGS register.
Note that the `popfl` instruction pops out a doubleword (4 bytes) from the stack and places it in the EFLAGS register.
As the value actually popped is `2`, the EFLAGS register is effectively cleared (IA-32 requires that bit 2 of the EFLAGS register always be 1).

Our next code block clears (sets to `0`) the memory range `0x5e00-0x8fff`.
This range is where the various data structures will be created:

[.programlisting]
....
/*
 * Initialize memory.
 */
		mov $MEM_IDT,%di		# Memory to initialize
		mov $(MEM_ORG-MEM_IDT)/2,%cx	#  Words to zero
		rep				# Zero-fill
		stosw				#  memory
....
.[.filename]#stand/i386/btx/btx/btx.S# [[btx-clear-mem]] Recall that [.filename]#boot1# was originally loaded to address `0x7c00`, so, with this memory initialization, that copy effectively disappeared. However, also recall that [.filename]#boot1# was relocated to `0x700`, so _that_ copy is still in memory, and the BTX server will make use of it. Next, the real-mode IVT (Interrupt Vector Table) is updated. The IVT is an array of segment/offset pairs for exception and interrupt handlers. The BIOS normally maps hardware interrupts to interrupt vectors `0x8` to `0xf` and `0x70` to `0x77` but, as will be seen, the 8259A Programmable Interrupt Controller, the chip controlling the actual mapping of hardware interrupts to interrupt vectors, is programmed to remap these interrupt vectors from `0x8-0xf` to `0x20-0x27` and from `0x70-0x77` to `0x28-0x2f`. Thus, interrupt handlers are provided for interrupt vectors `0x20-0x2f`. The reason the BIOS-provided handlers are not used directly is that they work in 16-bit real mode, but not 32-bit protected mode. Processor mode will be switched to 32-bit protected mode shortly. However, the BTX server sets up a mechanism to effectively use the handlers provided by the BIOS: [.programlisting] .... /* * Update real mode IDT for reflecting hardware interrupts. */ mov $intr20,%bx # Address first handler mov $0x10,%cx # Number of handlers mov $0x20*4,%di # First real mode IDT entry init.0: mov %bx,(%di) # Store IP inc %di # Address next inc %di # entry stosw # Store CS add $4,%bx # Next handler loop init.0 # Next IRQ .... .[.filename]#stand/i386/btx/btx/btx.S# [[btx-ivt]] The next block creates the IDT (Interrupt Descriptor Table). The IDT is analogous, in protected mode, to the IVT in real mode. That is, the IDT describes the various exception and interrupt handlers used when the processor is executing in protected mode. 
In essence, it also consists of an array of segment/offset pairs, although the structure is somewhat more complex, because segments in protected mode are different from those in real mode, and various protection mechanisms apply: [.programlisting] .... /* * Create IDT. */ mov $MEM_IDT,%di # IDT's address mov $idtctl,%si # Control string init.1: lodsb # Get entry cbw # count xchg %ax,%cx # as word jcxz init.4 # If done lodsb # Get segment xchg %ax,%dx # P:DPL:type lodsw # Get control xchg %ax,%bx # set lodsw # Get handler offset mov $SEL_SCODE,%dh # Segment selector init.2: shr %bx # Handle this int? jnc init.3 # No mov %ax,(%di) # Set handler offset mov %dh,0x2(%di) # and selector mov %dl,0x5(%di) # Set P:DPL:type add $0x4,%ax # Next handler init.3: lea 0x8(%di),%di # Next entry loop init.2 # Till set done jmp init.1 # Continue .... .[.filename]#stand/i386/btx/btx/btx.S# [[btx-idt]] Each entry in the `IDT` is 8 bytes long. Besides the segment/offset information, they also describe the segment type, privilege level, and whether the segment is present in memory or not. The construction is such that interrupt vectors from `0` to `0xf` (exceptions) are handled by function `intx00`; vector `0x10` (also an exception) is handled by `intx10`; hardware interrupts, which are later configured to start at interrupt vector `0x20` all the way to interrupt vector `0x2f`, are handled by function `intx20`. Lastly, interrupt vector `0x30`, which is used for system calls, is handled by `intx30`, and vectors `0x31` and `0x32` are handled by `intx31`. It must be noted that only descriptors for interrupt vectors `0x30`, `0x31` and `0x32` are given privilege level 3, the same privilege level as the [.filename]#boot2# client, which means the client can execute a software-generated interrupt to these vectors through the `int` instruction without failing (this is the way [.filename]#boot2# uses the services provided by the BTX server). 
Also, note that _only_ software-generated interrupts are protected from code executing in lesser privilege levels. Hardware-generated interrupts and processor-generated exceptions are _always_ handled adequately, regardless of the actual privileges involved. The next step is to initialize the TSS (Task-State Segment). The TSS is a hardware feature that helps the operating system or executive software implement multitasking functionality through process abstraction. The IA-32 architecture demands the creation and use of _at least_ one TSS if multitasking facilities are used or different privilege levels are defined. Since the [.filename]#boot2# client is executed in privilege level 3, but the BTX server runs in privilege level 0, a TSS must be defined: [.programlisting] .... /* * Initialize TSS. */ init.4: movb $_ESP0H,TSS_ESP0+1(%di) # Set ESP0 movb $SEL_SDATA,TSS_SS0(%di) # Set SS0 movb $_TSSIO,TSS_MAP(%di) # Set I/O bit map base .... .[.filename]#stand/i386/btx/btx/btx.S# [[btx-tss]] Note that a value is given for the Privilege Level 0 stack pointer and stack segment in the TSS. This is needed because, if an interrupt or exception is received while executing [.filename]#boot2# in Privilege Level 3, a change to Privilege Level 0 is automatically performed by the processor, so a new working stack is needed. Finally, the I/O Map Base Address field of the TSS is given a value, which is a 16-bit offset from the beginning of the TSS to the I/O Permission Bitmap and the Interrupt Redirection Bitmap. After the IDT and TSS are created, the processor is ready to switch to protected mode. This is done in the next block: [.programlisting] .... /* * Bring up the system. 
*/ mov $0x2820,%bx # Set protected mode callw setpic # IRQ offsets lidt idtdesc # Set IDT lgdt gdtdesc # Set GDT mov %cr0,%eax # Switch to protected inc %ax # mode mov %eax,%cr0 # ljmp $SEL_SCODE,$init.8 # To 32-bit code .code32 init.8: xorl %ecx,%ecx # Zero movb $SEL_SDATA,%cl # To 32-bit movw %cx,%ss # stack .... .[.filename]#stand/i386/btx/btx/btx.S# [[btx-prot]] First, a call is made to `setpic` to program the 8259A PIC (Programmable Interrupt Controller). This chip is connected to multiple hardware interrupt sources. Upon receiving an interrupt from a device, it signals the processor with the appropriate interrupt vector. This can be customized so that specific interrupts are associated with specific interrupt vectors, as explained before. Next, the IDTR (Interrupt Descriptor Table Register) and GDTR (Global Descriptor Table Register) are loaded with the instructions `lidt` and `lgdt`, respectively. These registers are loaded with the base address and limit for the IDT and GDT. The following three instructions set the Protection Enable (PE) bit of the `%cr0` register. This effectively switches the processor to 32-bit protected mode. Next, a long jump is made to `init.8` using segment selector SEL_SCODE, which selects the Supervisor Code Segment. The processor is effectively executing in CPL 0, the most privileged level, after this jump. Finally, the Supervisor Data Segment is selected for the stack by assigning the segment selector SEL_SDATA to the `%ss` register. This data segment also has a privilege level of `0`. Our last code block is responsible for loading the TR (Task Register) with the segment selector for the TSS we created earlier, and setting the User Mode environment before passing execution control to the [.filename]#boot2# client. [.programlisting] .... /* * Launch user task. 
*/ movb $SEL_TSS,%cl # Set task ltr %cx # register movl $MEM_USR,%edx # User base address movzwl %ss:BDA_MEM,%eax # Get free memory shll $0xa,%eax # To bytes subl $ARGSPACE,%eax # Less arg space subl %edx,%eax # Less base movb $SEL_UDATA,%cl # User data selector pushl %ecx # Set SS pushl %eax # Set ESP push $0x202 # Set flags (IF set) push $SEL_UCODE # Set CS pushl btx_hdr+0xc # Set EIP pushl %ecx # Set GS pushl %ecx # Set FS pushl %ecx # Set DS pushl %ecx # Set ES pushl %edx # Set EAX movb $0x7,%cl # Set remaining init.9: push $0x0 # general loop init.9 # registers #ifdef BTX_SERIAL call sio_init # setup the serial console #endif popa # and initialize popl %es # Initialize popl %ds # user popl %fs # segment popl %gs # registers iret # To user mode .... .[.filename]#stand/i386/btx/btx/btx.S# [[btx-end]] Note that the client's environment includes a stack segment selector and stack pointer (registers `%ss` and `%esp`). Indeed, once the TR is loaded with the TSS segment selector (instruction `ltr`), the stack pointer is calculated and pushed onto the stack along with the stack's segment selector. Next, the value `0x202` is pushed onto the stack; it is the value that the EFLAGS will get when control is passed to the client. Also, the User Mode code segment selector and the client's entry point are pushed. Recall that this entry point is patched in the BTX header at link time. Finally, segment selectors (stored in register `%ecx`) for the segment registers `%gs, %fs, %ds and %es` are pushed onto the stack, along with the value in `%edx` (`0xa000`). Keep in mind the various values that have been pushed onto the stack (they will be popped out shortly). Next, values for the remaining general purpose registers are also pushed onto the stack (note the `loop` that pushes the value `0` seven times). Now, the values start to be popped off the stack. First, the `popa` instruction pops out of the stack the latest seven values pushed. 
They are stored in the general purpose registers in order `%edi, %esi, %ebp, %ebx, %edx, %ecx, %eax`. Then, the various segment selectors pushed are popped into the various segment registers. Five values still remain on the stack. They are popped when the `iret` instruction is executed. This instruction first pops the value that was pushed from the BTX header. This value is a pointer to [.filename]#boot2#'s entry point. It is placed in the register `%eip`, the instruction pointer register. Next, the segment selector for the User Code Segment is popped and copied to register `%cs`. Remember that this segment's privilege level is 3, the least privileged level. This means that we must provide values for the stack of this privilege level. This is why the processor, besides further popping the value for the EFLAGS register, does two more pops out of the stack. These values go to the stack pointer (`%esp`) and the stack segment (`%ss`). Now, execution continues at ``boot2``'s entry point. It is important to note how the User Code Segment is defined. This segment's _base address_ is set to `0xa000`. This means that code memory addresses are _relative_ to address 0xa000; if code being executed is fetched from address `0x2000`, the _actual_ memory addressed is `0xa000+0x2000=0xc000`. [[boot2]] == boot2 Stage `boot2` defines an important structure, `struct bootinfo`. This structure is initialized by `boot2` and passed to the loader, and then further to the kernel. Some fields of this structure are set by `boot2`, the rest by the loader. This structure, among other information, contains the kernel filename, BIOS hard disk geometry, BIOS drive number for boot device, physical memory available, `envp` pointer, etc. The definition for it is: [.programlisting] .... /usr/include/machine/bootinfo.h: struct bootinfo { u_int32_t bi_version; u_int32_t bi_kernelname; /* represents a char * */ u_int32_t bi_nfs_diskless; /* struct nfs_diskless * */ /* End of fields that are always present. 
*/ #define bi_endcommon bi_n_bios_used u_int32_t bi_n_bios_used; u_int32_t bi_bios_geom[N_BIOS_GEOM]; u_int32_t bi_size; u_int8_t bi_memsizes_valid; u_int8_t bi_bios_dev; /* bootdev BIOS unit number */ u_int8_t bi_pad[2]; u_int32_t bi_basemem; u_int32_t bi_extmem; u_int32_t bi_symtab; /* struct symtab * */ u_int32_t bi_esymtab; /* struct symtab * */ /* Items below only from advanced bootloader */ u_int32_t bi_kernend; /* end of kernel space */ u_int32_t bi_envp; /* environment */ u_int32_t bi_modulep; /* preloaded modules */ }; .... `boot2` enters an infinite loop waiting for user input, then calls `load()`. If the user does not press anything, the loop is broken by a timeout, so `load()` will load the default file ([.filename]#/boot/loader#). Functions `ino_t lookup(char *filename)` and `int xfsread(ino_t inode, void *buf, size_t nbyte)` are used to read the content of a file into memory. [.filename]#/boot/loader# is an ELF binary, but with the ELF header prepended by [.filename]#a.out#'s `struct exec` structure. `load()` scans the loader's ELF header, loads the content of [.filename]#/boot/loader# into memory, and passes execution to the loader's entry point: [.programlisting] .... stand/i386/boot2/boot2.c: __exec((caddr_t)addr, RB_BOOTINFO | (opts & RBX_MASK), MAKEBOOTDEV(dev_maj[dsk.type], dsk.slice, dsk.unit, dsk.part), 0, 0, 0, VTOP(&bootinfo)); .... [[boot-loader]] == loader Stage loader is a BTX client as well. I will not describe it here in detail; there is a comprehensive man page written by Mike Smith, man:loader[8]. The underlying mechanisms and BTX were discussed above. The main task for the loader is to boot the kernel. When the kernel is loaded into memory, it is called by the loader: [.programlisting] .... stand/common/boot.c: /* Call the exec handler from the loader matching the kernel */ file_formats[fp->f_loader]->l_exec(fp); .... [[boot-kernel]] == Kernel Initialization Let us take a look at the command that links the kernel. 
This will help identify the exact location where the loader passes execution to the kernel. This location is the kernel's actual entry point. This command is now excluded from [.filename]#sys/conf/Makefile.i386#. The content that interests us can be found in [.filename]#/usr/obj/usr/src/i386.i386/sys/GENERIC/#. [.programlisting] .... /usr/obj/usr/src/i386.i386/sys/GENERIC/kernel.meta: ld -m elf_i386_fbsd -Bdynamic -T /usr/src/sys/conf/ldscript.i386 --build-id=sha1 --no-warn-mismatch \ --warn-common --export-dynamic --dynamic-linker /red/herring -X -o kernel locore.o .... A few interesting things can be seen here. First, the kernel is an ELF dynamically linked binary, but the dynamic linker for the kernel is [.filename]#/red/herring#, which is definitely a bogus file. Second, taking a look at the file [.filename]#sys/conf/ldscript.i386# gives an idea about what ld options are used when linking the kernel. Reading through the first few lines, the string [.programlisting] .... sys/conf/ldscript.i386: ENTRY(btext) .... says that the kernel's entry point is the symbol `btext`. This symbol is defined in [.filename]#locore.s#: [.programlisting] .... sys/i386/i386/locore.s: .text /********************************************************************** * * This is where the bootblocks start us, set the ball rolling... * */ NON_GPROF_ENTRY(btext) .... First, the register EFLAGS is set to a predefined value of 0x00000002. Then the `%fs` and `%gs` segment registers are initialized: [.programlisting] .... sys/i386/i386/locore.s: /* Don't trust what the BIOS gives for eflags. */ pushl $PSL_KERNEL popfl /* * Don't trust what the BIOS gives for %fs and %gs. Trust the bootstrap * to set %cs, %ds, %es and %ss. */ mov %ds, %ax mov %ax, %fs mov %ax, %gs .... `btext` calls the routines `recover_bootinfo()` and `identify_cpu()`, which are also defined in [.filename]#locore.s#. 
Here is a description of what they do: [.informaltable] [cols="1,1", frame="none"] |=== |`recover_bootinfo` |This routine parses the parameters to the kernel passed from the bootstrap. The kernel may have been booted in three ways: by the loader, described above, by the old disk boot blocks, or by the old diskless boot procedure. This function determines the booting method, and stores the `struct bootinfo` structure into the kernel memory. |`identify_cpu` |This function tries to find out what CPU it is running on, storing the value found in a variable `_cpu`. |=== The next step is to enable VME, if the CPU supports it: [.programlisting] .... sys/i386/i386/mpboot.s: testl $CPUID_VME,%edx jz 3f orl $CR4_VME,%eax 3: movl %eax,%cr4 .... Then, paging is enabled: [.programlisting] .... sys/i386/i386/mpboot.s: /* Now enable paging */ movl IdlePTD_nopae, %eax movl %eax,%cr3 /* load ptd addr into mmu */ movl %cr0,%eax /* get control word */ orl $CR0_PE|CR0_PG,%eax /* enable paging */ movl %eax,%cr0 /* and let's page NOW! */ .... Because paging is now enabled, the next three lines of code perform the jump needed to continue execution in the virtual address space: [.programlisting] .... sys/i386/i386/mpboot.s: pushl $mp_begin /* jump to high mem */ ret /* now running relocated at KERNBASE where the system is linked to run */ mp_begin: /* now running relocated at KERNBASE */ .... The function `init386()` is called with a pointer to the first free physical page, followed by `mi_startup()`. `init386` is an architecture-dependent initialization function, and `mi_startup()` is an architecture-independent one (the 'mi_' prefix stands for Machine Independent). The kernel never returns from `mi_startup()`, and by calling it, the kernel finishes booting: [.programlisting] .... sys/i386/i386/locore.s: pushl physfree /* value of first for init386(first) */ call init386 /* wire 386 chip for unix operation */ addl $4,%esp movl %eax,%esp /* Switch to true top of stack. 
*/ call mi_startup /* autoconfiguration, mountroot etc */ /* NOTREACHED */ .... === `init386()` `init386()` is defined in [.filename]#sys/i386/i386/machdep.c# and performs low-level initialization specific to the i386 chip. The switch to protected mode was performed by the loader. The loader has created the very first task, in which the kernel continues to operate. Before looking at the code, consider the tasks the processor must complete to initialize protected mode execution: * Initialize the kernel tunable parameters, passed from the bootstrapping program. * Prepare the GDT. * Prepare the IDT. * Initialize the system console. * Initialize the DDB, if it is compiled into the kernel. * Initialize the TSS. * Prepare the LDT. * Set up thread0's pcb. `init386()` initializes the tunable parameters passed from bootstrap by setting the environment pointer (envp) and calling `init_param1()`. The envp pointer has been passed from the loader in the `bootinfo` structure: [.programlisting] .... sys/i386/i386/machdep.c: /* Init basic tunables, hz etc */ init_param1(); .... `init_param1()` is defined in [.filename]#sys/kern/subr_param.c#. That file has a number of sysctls, and two functions, `init_param1()` and `init_param2()`, that are called from `init386()`: [.programlisting] .... sys/kern/subr_param.c: hz = -1; TUNABLE_INT_FETCH("kern.hz", &hz); if (hz == -1) hz = vm_guest > VM_GUEST_NO ? HZ_VM : HZ; .... `TUNABLE_INT_FETCH` is used to fetch the value from the environment: [.programlisting] .... /usr/src/sys/sys/kernel.h: #define TUNABLE_INT_FETCH(path, var) getenv_int((path), (var)) .... Sysctl `kern.hz` is the system clock tick. Additionally, these sysctls are set by `init_param1()`: `kern.maxswzone, kern.maxbcache, kern.maxtsiz, kern.dfldsiz, kern.maxdsiz, kern.dflssiz, kern.maxssiz, kern.sgrowsiz`. Then `init386()` prepares the Global Descriptor Table (GDT). Every task on an x86 is running in its own virtual address space, and this space is addressed by a segment:offset pair. 
Say, for instance, that the current instruction to be executed by the processor lies at CS:EIP; then the linear virtual address for that instruction would be "the virtual address of code segment CS" + EIP. For convenience, segments begin at virtual address 0 and end at a 4GB boundary. Therefore, the instruction's linear virtual address for this example would just be the value of EIP. Segment registers such as CS, DS, etc. are the selectors, i.e., indexes, into the GDT (to be more precise, an index is not a selector itself, but the INDEX field of a selector). FreeBSD's GDT holds descriptors for 19 selectors per CPU: [.programlisting] .... sys/i386/i386/machdep.c: union descriptor gdt0[NGDT]; /* initial global descriptor table */ union descriptor *gdt = gdt0; /* global descriptor table */ sys/x86/include/segments.h: /* * Entries in the Global Descriptor Table (GDT) */ #define GNULL_SEL 0 /* Null Descriptor */ #define GPRIV_SEL 1 /* SMP Per-Processor Private Data */ #define GUFS_SEL 2 /* User %fs Descriptor (order critical: 1) */ #define GUGS_SEL 3 /* User %gs Descriptor (order critical: 2) */ #define GCODE_SEL 4 /* Kernel Code Descriptor (order critical: 1) */ #define GDATA_SEL 5 /* Kernel Data Descriptor (order critical: 2) */ #define GUCODE_SEL 6 /* User Code Descriptor (order critical: 3) */ #define GUDATA_SEL 7 /* User Data Descriptor (order critical: 4) */ #define GBIOSLOWMEM_SEL 8 /* BIOS low memory access (must be entry 8) */ #define GPROC0_SEL 9 /* Task state process slot zero and up */ #define GLDT_SEL 10 /* Default User LDT */ #define GUSERLDT_SEL 11 /* User LDT */ #define GPANIC_SEL 12 /* Task state to consider panic from */ #define GBIOSCODE32_SEL 13 /* BIOS interface (32bit Code) */ #define GBIOSCODE16_SEL 14 /* BIOS interface (16bit Code) */ #define GBIOSDATA_SEL 15 /* BIOS interface (Data) */ #define GBIOSUTIL_SEL 16 /* BIOS interface (Utility) */ #define GBIOSARGS_SEL 17 /* BIOS interface (Arguments) */ #define GNDIS_SEL 18 /* For the NDIS layer */ #define NGDT 
19 .... Note that those #defines are not selectors themselves, but just the INDEX field of a selector, so they are exactly the indices of the GDT. For example, an actual selector for the kernel code (GCODE_SEL) has the value 0x20. The next step is to initialize the Interrupt Descriptor Table (IDT). This table is referenced by the processor when a software or hardware interrupt occurs. For example, to make a system call, a user application issues the `INT 0x80` instruction. This is a software interrupt, so the processor's hardware looks up a record with index 0x80 in the IDT. This record points to the routine that handles this interrupt; in this particular case, this will be the kernel's syscall gate. The IDT may have a maximum of 256 (0x100) records. The kernel allocates NIDT records for the IDT, where NIDT is the maximum (256): [.programlisting] .... sys/i386/i386/machdep.c: static struct gate_descriptor idt0[NIDT]; struct gate_descriptor *idt = &idt0[0]; /* interrupt descriptor table */ .... For each interrupt, an appropriate handler is set. The syscall gate for `INT 0x80` is set as well: [.programlisting] .... sys/i386/i386/machdep.c: setidt(IDT_SYSCALL, &IDTVEC(int0x80_syscall), SDT_SYS386IGT, SEL_UPL, GSEL(GCODE_SEL, SEL_KPL)); .... So when a userland application issues the `INT 0x80` instruction, control will transfer to the function `_Xint0x80_syscall`, which is in the kernel code segment and will be executed with supervisor privileges. Console and DDB are then initialized: [.programlisting] .... sys/i386/i386/machdep.c: cninit(); /* skipped */ kdb_init(); #ifdef KDB if (boothowto & RB_KDB) kdb_enter(KDB_WHY_BOOTFLAGS, "Boot flags requested debugger"); #endif .... The Task State Segment is another x86 protected mode structure; the TSS is used by the hardware to store task information when a task switch occurs. The Local Descriptor Table (LDT) is used to reference userland code and data. 
Several selectors are defined to point to the LDT; they are the system call gates and the user code and data selectors: [.programlisting] .... sys/x86/include/segments.h: #define LSYS5CALLS_SEL 0 /* forced by intel BCS */ #define LSYS5SIGR_SEL 1 #define LUCODE_SEL 3 #define LUDATA_SEL 5 #define NLDT (LUDATA_SEL + 1) .... Next, proc0's Process Control Block (`struct pcb`) structure is initialized. proc0 is a `struct proc` structure that describes a kernel process. It is always present while the kernel is running; therefore, it is linked with thread0: [.programlisting] .... sys/i386/i386/machdep.c: register_t init386(int first) { /* ... skipped ... */ proc_linkup0(&proc0, &thread0); /* ... skipped ... */ } .... The structure `struct pcb` is a part of a proc structure. It is defined in [.filename]#/usr/include/machine/pcb.h# and holds a process's information specific to the i386 architecture, such as register values. === `mi_startup()` This function performs a bubble sort of all the system initialization objects and then calls the entry of each object one by one: [.programlisting] .... sys/kern/init_main.c: for (sipp = sysinit; sipp < sysinit_end; sipp++) { /* ... skipped ... */ /* Call function */ (*((*sipp)->func))((*sipp)->udata); /* ... skipped ... */ } .... Although the sysinit framework is described in the extref:{developers-handbook}[Developers' Handbook], I will discuss its internals here. Every system initialization object (sysinit object) is created by calling a SYSINIT() macro. Let us take the `announce` sysinit object as an example. This object prints the copyright message: [.programlisting] .... sys/kern/init_main.c: static void print_caddr_t(void *data __unused) { printf("%s", (char *)data); } /* ... skipped ... */ SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright); .... The subsystem ID for this object is SI_SUB_COPYRIGHT (0x0800001). So, the copyright message will be printed out first, just after the console initialization. 
Let us take a look at what exactly the macro `SYSINIT()` does. It expands to a `C_SYSINIT()` macro. The `C_SYSINIT()` macro then expands to a static `struct sysinit` structure declaration with another `DATA_SET` macro call: [.programlisting] .... /usr/include/sys/kernel.h: #define C_SYSINIT(uniquifier, subsystem, order, func, ident) \ static struct sysinit uniquifier ## _sys_init = { \ subsystem, \ order, \ func, \ (ident) \ }; \ DATA_SET(sysinit_set,uniquifier ## _sys_init); #define SYSINIT(uniquifier, subsystem, order, func, ident) \ C_SYSINIT(uniquifier, subsystem, order, \ (sysinit_cfunc_t)(sysinit_nfunc_t)func, (void *)(ident)) .... The `DATA_SET()` macro expands to a `_MAKE_SET()`, and that macro is the point where all the sysinit magic is hidden: [.programlisting] .... /usr/include/linker_set.h: #define TEXT_SET(set, sym) _MAKE_SET(set, sym) #define DATA_SET(set, sym) _MAKE_SET(set, sym) .... After these macros are executed, various sections are made in the kernel, including `set_sysinit_set`. Running objdump on a kernel binary, you may notice the presence of such small sections: [source,bash] .... % llvm-objdump -h /kernel Sections: Idx Name Size VMA Type 10 set_sysctl_set 000021d4 01827078 DATA 16 set_kbddriver_set 00000010 0182a4d0 DATA 20 set_scterm_set 0000000c 0182c75c DATA 21 set_cons_set 00000014 0182c768 DATA 33 set_scrndr_set 00000024 0182c828 DATA 41 set_sysinit_set 000014d8 018fabb0 DATA .... This output shows that the size of the `set_sysinit_set` section is 0x14d8 bytes, so `0x14d8/sizeof(void *)` sysinit objects are compiled into the kernel. The other sections such as `set_sysctl_set` represent other linker sets. The `SET_DECLARE()` macro makes the content of the `set_sysinit_set` section available as a linker set of `struct sysinit` entries: [.programlisting] .... sys/kern/init_main.c: SET_DECLARE(sysinit_set, struct sysinit); .... The `struct sysinit` is defined as follows: [.programlisting] .... 
sys/sys/kernel.h: struct sysinit { enum sysinit_sub_id subsystem; /* subsystem identifier*/ enum sysinit_elem_order order; /* init order within subsystem*/ sysinit_cfunc_t func; /* function */ const void *udata; /* multiplexer/argument */ }; .... Returning to the `mi_startup()` discussion, it should now be clear how the sysinit objects are organized. The `mi_startup()` function sorts them and calls each one. The very last object is the system scheduler: [.programlisting] .... /usr/include/sys/kernel.h: enum sysinit_sub_id { SI_SUB_DUMMY = 0x0000000, /* not executed; for linker*/ SI_SUB_DONE = 0x0000001, /* processed*/ SI_SUB_TUNABLES = 0x0700000, /* establish tunable values */ SI_SUB_COPYRIGHT = 0x0800001, /* first use of console*/ ... SI_SUB_LAST = 0xfffffff /* final initialization */ }; .... The system scheduler sysinit object is defined in the file [.filename]#sys/vm/vm_glue.c#, and the entry point for that object is `scheduler()`. That function is actually an infinite loop, and it represents a process with PID 0, the swapper process. The thread0 structure, mentioned before, is used to describe it. The first user process, called _init_, is created by the sysinit object `init`: [.programlisting] .... 
sys/kern/init_main.c: static void create_init(const void *udata __unused) { struct fork_req fr; struct ucred *newcred, *oldcred; struct thread *td; int error; bzero(&fr, sizeof(fr)); fr.fr_flags = RFFDG | RFPROC | RFSTOPPED; fr.fr_procp = &initproc; error = fork1(&thread0, &fr); if (error) panic("cannot fork init: %d\n", error); KASSERT(initproc->p_pid == 1, ("create_init: initproc->p_pid != 1")); /* divorce init's credentials from the kernel's */ newcred = crget(); sx_xlock(&proctree_lock); PROC_LOCK(initproc); initproc->p_flag |= P_SYSTEM | P_INMEM; initproc->p_treeflag |= P_TREE_REAPER; oldcred = initproc->p_ucred; crcopy(newcred, oldcred); #ifdef MAC mac_cred_create_init(newcred); #endif #ifdef AUDIT audit_cred_proc1(newcred); #endif proc_set_cred(initproc, newcred); td = FIRST_THREAD_IN_PROC(initproc); crcowfree(td); td->td_realucred = crcowget(initproc->p_ucred); td->td_ucred = td->td_realucred; PROC_UNLOCK(initproc); sx_xunlock(&proctree_lock); crfree(oldcred); cpu_fork_kthread_handler(FIRST_THREAD_IN_PROC(initproc), start_init, NULL); } SYSINIT(init, SI_SUB_CREATE_INIT, SI_ORDER_FIRST, create_init, NULL); .... The function `create_init()` allocates a new process by calling `fork1()`, but does not mark it runnable. When this new process is scheduled for execution by the scheduler, `start_init()` will be called. That function is defined in [.filename]#init_main.c#. It tries to load and exec the [.filename]#init# binary, probing [.filename]#/sbin/init# first, then [.filename]#/sbin/oinit#, [.filename]#/sbin/init.bak#, and finally [.filename]#/rescue/init#: [.programlisting] .... sys/kern/init_main.c: static char init_path[MAXPATHLEN] = #ifdef INIT_PATH __XSTRING(INIT_PATH); #else "/sbin/init:/sbin/oinit:/sbin/init.bak:/rescue/init"; #endif .... 
diff --git a/documentation/content/en/books/arch-handbook/sound/_index.adoc b/documentation/content/en/books/arch-handbook/sound/_index.adoc index 395cade887..f7a7d6c19f 100644 --- a/documentation/content/en/books/arch-handbook/sound/_index.adoc +++ b/documentation/content/en/books/arch-handbook/sound/_index.adoc @@ -1,358 +1,358 @@ --- title: Chapter 15. Sound Subsystem prev: books/arch-handbook/newbus next: books/arch-handbook/pccard description: FreeBSD Sound Subsystem tags: ["Sound", "OSS", "pcm", "mixer"] showBookMenu: true weight: 17 params: path: "/books/arch-handbook/sound/" --- [[oss]] = Sound Subsystem :doctype: book :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :sectnumoffset: 15 :partnums: :source-highlighter: rouge :experimental: :images-path: books/arch-handbook/ ifdef::env-beastie[] ifdef::backend-html5[] :imagesdir: ../../../../images/{images-path} endif::[] ifndef::book[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] toc::[] endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] toc::[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [[oss-intro]] == Introduction The FreeBSD sound subsystem cleanly separates generic sound handling issues from device-specific ones. This makes it easier to add support for new hardware. The man:pcm[4] framework is the central piece of the sound subsystem. It mainly implements the following elements: * A system call interface (read, write, ioctls) to digitized sound and mixer functions. The ioctl command set is compatible with the legacy _OSS_ or _Voxware_ interface, allowing common multimedia applications to be ported without modification. 
* Common code for processing sound data (format conversions, virtual channels).
* A uniform software interface to hardware-specific audio interface modules.
* Additional support for some common hardware interfaces (ac97), or shared hardware-specific code (ex: ISA DMA routines).

The support for specific sound cards is implemented by hardware-specific drivers, which provide channel and mixer interfaces to plug into the generic [.filename]#pcm# code.

In this chapter, the term [.filename]#pcm# will refer to the central, common part of the sound driver, as opposed to the hardware-specific modules.

The prospective driver writer will of course want to start from an existing module and use the code as the ultimate reference.
But, while the sound code is nice and clean, it is also mostly devoid of comments.
This document tries to give an overview of the framework interface and answer some questions that may arise while adapting the existing code.

As an alternative, or in addition to starting from a working example, you can find a commented driver template at https://people.FreeBSD.org/~cg/template.c[https://people.FreeBSD.org/~cg/template.c].

[[oss-files]]
== Files

All the relevant code lives in [.filename]#/usr/src/sys/dev/sound/#, except for the public ioctl interface definitions, found in [.filename]#/usr/src/sys/sys/soundcard.h#.

Under [.filename]#/usr/src/sys/dev/sound/#, the [.filename]#pcm/# directory holds the central code, while the [.filename]#pci/#, [.filename]#isa/# and [.filename]#usb/# directories have the drivers for PCI and ISA boards, and for USB audio devices.

[[pcm-probe-and-attach]]
== Probing, Attaching, etc.

Sound drivers probe and attach in almost the same way as any hardware driver module.
You might want to look at the crossref:isa-driver[isa-driver,ISA] or crossref:pci[pci,PCI] specific sections of the handbook for more information.
However, sound drivers differ in some ways:

* They declare themselves as [.filename]#pcm# class devices, with a `struct snddev_info` device private structure:
+
[.programlisting]
....
static driver_t xxx_driver = {
	"pcm",
	xxx_methods,
	sizeof(struct snddev_info)
};

DRIVER_MODULE(snd_xxxpci, pci, xxx_driver, pcm_devclass, 0, 0);
MODULE_DEPEND(snd_xxxpci, snd_pcm, PCM_MINVER, PCM_PREFVER, PCM_MAXVER);
....
+
Most sound drivers need to store additional private information about their device.
A private data structure is usually allocated in the attach routine.
Its address is passed to [.filename]#pcm# by the calls to `pcm_register()` and `mixer_init()`.
[.filename]#pcm# later passes back this address as a parameter in calls to the sound driver interfaces.
* The sound driver attach routine should declare its MIXER or AC97 interface to [.filename]#pcm# by calling `mixer_init()`.
For a MIXER interface, this causes in turn a call to crossref:sound[xxxmixer-init,`xxxmixer_init()`].
* The sound driver attach routine declares its general CHANNEL configuration to [.filename]#pcm# by calling `pcm_register(dev, sc, nplay, nrec)`, where `sc` is the address for the device data structure, used in further calls from [.filename]#pcm#, and `nplay` and `nrec` are the number of play and record channels.
* The sound driver attach routine declares each of its channel objects by calls to `pcm_addchan()`.
This sets up the channel glue in [.filename]#pcm# and causes in turn a call to crossref:sound[xxxchannel-init,`xxxchannel_init()`].
* The sound driver detach routine should call `pcm_unregister()` before releasing its resources.

There are two possible methods to handle non-PnP devices:

* Use a `device_identify()` method (example: [.filename]#sound/isa/es1888.c#).
The `device_identify()` method probes for the hardware at known addresses and, if it finds a supported device, creates a new pcm device which is then passed to probe/attach.
* Use a custom kernel configuration with appropriate hints for pcm devices (example: [.filename]#sound/isa/mss.c#). [.filename]#pcm# drivers should implement `device_suspend`, `device_resume` and `device_shutdown` routines, so that power management and module unloading function correctly. [[oss-interfaces]] == Interfaces The interface between the [.filename]#pcm# core and the sound drivers is defined in terms of crossref:kobj[kernel-objects,kernel objects]. There are two main interfaces that a sound driver will usually provide: _CHANNEL_ and either _MIXER_ or _AC97_. The _AC97_ interface is a very small hardware access (register read/write) interface, implemented by drivers for hardware with an AC97 codec. In this case, the actual MIXER interface is provided by the shared AC97 code in [.filename]#pcm#. === The CHANNEL Interface ==== Common Notes for Function Parameters Sound drivers usually have a private data structure to describe their device, and one structure for each play and record data channel that it supports. For all CHANNEL interface functions, the first parameter is an opaque pointer. The second parameter is a pointer to the private channel data structure, except for `channel_init()` which has a pointer to the private device structure (and returns the channel pointer for further use by [.filename]#pcm#). ==== Overview of Data Transfer Operations For sound data transfers, the [.filename]#pcm# core and the sound drivers communicate through a shared memory area, described by a `struct snd_dbuf`. `struct snd_dbuf` is private to [.filename]#pcm#, and sound drivers obtain values of interest by calls to accessor functions (`sndbuf_getxxx()`). The shared memory area has a size of `sndbuf_getsize()` and is divided into fixed size blocks of `sndbuf_getblksz()` bytes. 
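The buffer geometry described above (a shared area of `sndbuf_getsize()` bytes, divided into `sndbuf_getblksz()`-byte blocks) can be sketched with a toy structure. The `toy_` names are illustrative, not the real `sndbuf` accessors:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the pcm shared buffer geometry: a buffer of "size"
 * bytes divided into fixed-size blocks of "blksz" bytes, as reported
 * by sndbuf_getsize()/sndbuf_getblksz(). Illustrative only.
 */
struct toy_sndbuf {
	size_t size;	/* total buffer size in bytes */
	size_t blksz;	/* unit transfer (per-interrupt) size */
};

/* Number of whole blocks in the buffer. */
static size_t
toy_sndbuf_blkcnt(const struct toy_sndbuf *b)
{
	return b->size / b->blksz;
}

/* Block index that a given byte offset falls into, wrapping around. */
static size_t
toy_sndbuf_blkidx(const struct toy_sndbuf *b, size_t off)
{
	return (off % b->size) / b->blksz;
}
```

A driver that fields one interrupt per block would advance through `toy_sndbuf_blkcnt()` indices and wrap, which is the circular traversal the transfer mechanism below relies on.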
When playing, the general transfer mechanism is as follows (reverse the idea for recording):

* [.filename]#pcm# initially fills up the buffer, then calls the sound driver's crossref:sound[channel-trigger,`xxxchannel_trigger()`] function with a parameter of PCMTRIG_START.
* The sound driver then arranges to repeatedly transfer the whole memory area (`sndbuf_getbuf()`, `sndbuf_getsize()`) to the device, in blocks of `sndbuf_getblksz()` bytes.
It calls back the [.filename]#pcm# `chn_intr()` function for each transferred block (this will typically happen at interrupt time).
* `chn_intr()` arranges to copy new data to the area that was transferred to the device (now free), and make appropriate updates to the `snd_dbuf` structure.

[[xxxchannel-init]]
==== channel_init

`xxxchannel_init()` is called to initialize each of the play or record channels.
-The calls are initiated from the sound driver attach routine. (See the crossref:sound[pcm-probe-and-attach,probe and attach section).
+The calls are initiated from the sound driver attach routine. (See the crossref:sound[pcm-probe-and-attach,probe and attach section]).

[.programlisting]
....
static void *
xxxchannel_init(kobj_t obj, void *data,
    struct snd_dbuf *b, struct pcm_channel *c, int dir) <.>
{
	struct xxx_info *sc = data;
	struct xxx_chinfo *ch;
	...
	return ch; <.>
}
....
<.> `b` is the address for the channel `struct snd_dbuf`.
It should be initialized in the function by calling `sndbuf_alloc()`.
The buffer size to use is normally a small multiple of the 'typical' unit transfer size for your device.
`c` is the [.filename]#pcm# channel control structure pointer.
This is an opaque object.
The function should store it in the local channel structure, to be used in later calls to [.filename]#pcm# (ie: `chn_intr(c)`).
`dir` indicates the channel direction (`PCMDIR_PLAY` or `PCMDIR_REC`).
<.> The function should return a pointer to the private area used to control this channel.
This will be passed as a parameter to other channel interface calls.

==== channel_setformat

`xxxchannel_setformat()` should set up the hardware for the specified channel for the specified sound format.

[.programlisting]
....
static int
xxxchannel_setformat(kobj_t obj, void *data, u_int32_t format) <.>
{
	struct xxx_chinfo *ch = data;
	...
	return 0;
}
....
<.> `format` is specified as an `AFMT_XXX` value ([.filename]#soundcard.h#).

==== channel_setspeed

`xxxchannel_setspeed()` sets up the channel hardware for the specified sampling speed, and returns the possibly adjusted speed.

[.programlisting]
....
static int
xxxchannel_setspeed(kobj_t obj, void *data, u_int32_t speed)
{
	struct xxx_chinfo *ch = data;
	...
	return speed;
}
....

==== channel_setblocksize

`xxxchannel_setblocksize()` sets the block size, which is the size of unit transactions between [.filename]#pcm# and the sound driver, and between the sound driver and the device.
Typically, this would be the number of bytes transferred before an interrupt occurs.
During a transfer, the sound driver should call [.filename]#pcm#'s `chn_intr()` every time this size has been transferred.

Most sound drivers only take note of the block size here, to be used when an actual transfer will be started.

[.programlisting]
....
static int
xxxchannel_setblocksize(kobj_t obj, void *data, u_int32_t blocksize)
{
	struct xxx_chinfo *ch = data;
	...
	return blocksize; <.>
}
....
<.> The function returns the possibly adjusted block size.
In case the block size is indeed changed, `sndbuf_resize()` should be called to adjust the buffer.

[[channel-trigger]]
==== channel_trigger

`xxxchannel_trigger()` is called by [.filename]#pcm# to control data transfer operations in the driver.

[.programlisting]
....
static int
xxxchannel_trigger(kobj_t obj, void *data, int go) <.>
{
	struct xxx_chinfo *ch = data;
	...
	return 0;
}
....
<.> `go` defines the action for the current call.
The possible values are the `PCMTRIG_XXX` constants (for example, the `PCMTRIG_START` used above to start a transfer).

[NOTE]
====
If the driver uses ISA DMA, `sndbuf_isadma()` should be called before performing actions on the device, and will take care of the DMA chip side of things.
====

==== channel_getptr

`xxxchannel_getptr()` returns the current offset in the transfer buffer.
This will typically be called by `chn_intr()`, and this is how [.filename]#pcm# knows where it can transfer new data.

==== channel_free

`xxxchannel_free()` is called to free up channel resources, for example when the driver is unloaded, and should be implemented if the channel data structures are dynamically allocated or if `sndbuf_alloc()` was not used for buffer allocation.

==== channel_getcaps

[.programlisting]
....
struct pcmchan_caps *
xxxchannel_getcaps(kobj_t obj, void *data)
{
	return &xxx_caps; <.>
}
....
<.> The routine returns a pointer to a (usually statically-defined) `pcmchan_caps` structure (defined in [.filename]#sound/pcm/channel.h#).
The structure holds the minimum and maximum sampling frequencies, and the accepted sound formats.
Look at any sound driver for an example.

==== More Functions

`channel_reset()`, `channel_resetdone()`, and `channel_notify()` are for special purposes and should not be implemented in a driver without discussing it on the {freebsd-multimedia}.

`channel_setdir()` is deprecated.

=== The MIXER Interface

[[xxxmixer-init]]
==== mixer_init

`xxxmixer_init()` initializes the hardware and tells [.filename]#pcm# what mixer devices are available for playing and recording.

[.programlisting]
....
static int
xxxmixer_init(struct snd_mixer *m)
{
	struct xxx_info *sc = mix_getdevinfo(m);
	u_int32_t v;

	[Initialize hardware]

	[Set appropriate bits in v for play mixers] <.>
	mix_setdevs(m, v);
	[Set appropriate bits in v for record mixers]
	mix_setrecdevs(m, v);

	return 0;
}
....
<.> Set bits in an integer value and call `mix_setdevs()` and `mix_setrecdevs()` to tell [.filename]#pcm# what devices exist.
Mixer bit definitions can be found in [.filename]#soundcard.h# (`SOUND_MASK_XXX` values and `SOUND_MIXER_XXX` bit shifts).

==== mixer_set

`xxxmixer_set()` sets the volume level for one mixer device.

[.programlisting]
....
static int
xxxmixer_set(struct snd_mixer *m, unsigned dev,
    unsigned left, unsigned right) <.>
{
	struct sc_info *sc = mix_getdevinfo(m);

	[set volume level]

	return left | (right << 8); <.>
}
....
<.> The device is specified as a `SOUND_MIXER_XXX` value.
The volume values are specified in range [0-100].
A value of zero should mute the device.
<.> As the hardware levels probably will not match the input scale, and some rounding will occur, the routine returns the actual level values (in range 0-100) as shown.

==== mixer_setrecsrc

`xxxmixer_setrecsrc()` sets the recording source device.

[.programlisting]
....
static int
xxxmixer_setrecsrc(struct snd_mixer *m, u_int32_t src) <.>
{
	struct xxx_info *sc = mix_getdevinfo(m);

	[look for non zero bit(s) in src, set up hardware]

	[update src to reflect actual action]
	return src; <.>
}
....
<.> The desired recording devices are specified as a bit field.
<.> The actual devices set for recording are returned.
Some drivers can only set one device for recording.
The function should return -1 if an error occurs.

==== mixer_uninit, mixer_reinit

`xxxmixer_uninit()` should ensure that all sound is muted and if possible mixer hardware should be powered down.

`xxxmixer_reinit()` should ensure that the mixer hardware is powered up and any settings not controlled by `mixer_set()` or `mixer_setrecsrc()` are restored.

=== The AC97 Interface

The _AC97_ interface is implemented by drivers with an AC97 codec.
It only has three methods:

* `xxxac97_init()` returns the number of ac97 codecs found.
* `ac97_read()` and `ac97_write()` read or write a specified register.

The _AC97_ interface is used by the AC97 code in [.filename]#pcm# to perform higher level operations.
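The division of labor in the AC97 interface described above (the hardware driver only exposes register access, the shared code does the rest) can be sketched with a toy register file. The `toy_` names and register layout are illustrative, not the real API or codec map:

```c
#include <assert.h>

/*
 * Hedged sketch of the small AC97 access interface: the hardware
 * driver exposes only codec register read/write, and the shared AC97
 * code in pcm implements the MIXER interface on top of those two
 * methods. The register file here is a fake, illustrative only.
 */
#define TOY_AC97_NREGS 64

struct toy_ac97 {
	unsigned short regs[TOY_AC97_NREGS];	/* pretend codec registers */
};

static unsigned short
toy_ac97_read(struct toy_ac97 *codec, int reg)
{
	return codec->regs[reg % TOY_AC97_NREGS];
}

static void
toy_ac97_write(struct toy_ac97 *codec, int reg, unsigned short val)
{
	codec->regs[reg % TOY_AC97_NREGS] = val;
}

/* Higher-level code (the shared AC97 mixer) would only ever go
 * through the two methods above, e.g. to program a volume register. */
static unsigned short
demo_ac97(void)
{
	struct toy_ac97 codec = { { 0 } };

	toy_ac97_write(&codec, 0x02, 0x0808);
	return toy_ac97_read(&codec, 0x02);
}
```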
Look at [.filename]#sound/pci/maestro3.c# or many others under [.filename]#sound/pci/# for an example. diff --git a/documentation/content/en/books/arch-handbook/usb/_index.adoc b/documentation/content/en/books/arch-handbook/usb/_index.adoc index 94a22e850b..a1c8f2b579 100644 --- a/documentation/content/en/books/arch-handbook/usb/_index.adoc +++ b/documentation/content/en/books/arch-handbook/usb/_index.adoc @@ -1,186 +1,186 @@ --- title: Chapter 13. USB Devices prev: books/arch-handbook/scsi next: books/arch-handbook/newbus description: USB Devices in FreeBSD tags: ["USB", "Structure", "UHCI", "OHCI"] showBookMenu: true weight: 15 params: path: "/books/arch-handbook/usb/" --- [[usb]] = USB Devices :doctype: book :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :sectnumoffset: 13 :partnums: :source-highlighter: rouge :experimental: :images-path: books/arch-handbook/ ifdef::env-beastie[] ifdef::backend-html5[] :imagesdir: ../../../../images/{images-path} endif::[] ifndef::book[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] toc::[] endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] toc::[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [[usb-intro]] == Introduction The Universal Serial Bus (USB) is a new way of attaching devices to personal computers. The bus architecture features two-way communication and has been developed as a response to devices becoming smarter and requiring more interaction with the host. USB support is included in all current PC chipsets and is therefore available in all recently built PCs. 
Apple's introduction of the USB-only iMac has been a major incentive for hardware manufacturers to produce USB versions of their devices.
The future PC specifications specify that all legacy connectors on PCs should be replaced by one or more USB connectors, providing generic plug and play capabilities.

Support for USB hardware was available at a very early stage in NetBSD and was developed by Lennart Augustsson for the NetBSD project.
The code has been ported to FreeBSD and we are currently maintaining a shared code base.
For the implementation of the USB subsystem a number of features of USB are important.

_Lennart Augustsson has done most of the implementation of the USB support for the NetBSD project.
Many thanks for this incredible amount of work.
Many thanks also to Ardy and Dirk for their comments and proofreading of this paper._

* Devices connect to ports on the computer directly or on devices called hubs, forming a treelike device structure.
* The devices can be connected and disconnected at run time.
* Devices can suspend themselves and trigger resumes of the host system.
* As the devices can be powered from the bus, the host software has to keep track of power budgets for each hub.
* Different quality of service requirements by the different device types together with the maximum of 126 devices that can be connected to the same bus, require proper scheduling of transfers on the shared bus to take full advantage of the 12Mbps bandwidth available (over 400Mbps with USB 2.0).
* Devices are intelligent and contain easily accessible information about themselves.

The development of drivers for the USB subsystem and devices connected to it is supported by the specifications that have been developed and will be developed.
These specifications are publicly available from the USB home pages.
Apple has been very strong in pushing for standards based drivers, by making drivers for the generic classes available in their operating system MacOS and discouraging the use of separate drivers for each new device. This chapter tries to collate essential information for a basic understanding of the USB 2.0 implementation stack in FreeBSD/NetBSD. It is recommended however to read it together with the relevant 2.0 specifications and other developer resources: * USB 2.0 Specification (http://www.usb.org/developers/docs/usb20_docs/[http://www.usb.org/developers/docs/usb20_docs/]) -* Universal Host Controller Interface (UHCI) Specification (link:ftp://ftp.netbsd.org/pub/NetBSD/misc/blymn/uhci11d.pdf[ftp://ftp.netbsd.org/pub/NetBSD/misc/blymn/uhci11d.pdf)] +* Universal Host Controller Interface (UHCI) Specification (link:ftp://ftp.netbsd.org/pub/NetBSD/misc/blymn/uhci11d.pdf[ftp://ftp.netbsd.org/pub/NetBSD/misc/blymn/uhci11d.pdf]) * Open Host Controller Interface (OHCI) Specification(link:ftp://ftp.compaq.com/pub/supportinformation/papers/hcir1_0a.pdf[ftp://ftp.compaq.com/pub/supportinformation/papers/hcir1_0a.pdf]) * Developer section of USB home page (http://www.usb.org/developers/[http://www.usb.org/developers/]) === Structure of the USB Stack The USB support in FreeBSD can be split into three layers. The lowest layer contains the host controller driver, providing a generic interface to the hardware and its scheduling facilities. It supports initialisation of the hardware, scheduling of transfers and handling of completed and/or failed transfers. Each host controller driver implements a virtual hub providing hardware independent access to the registers controlling the root ports on the back of the machine. The middle layer handles the device connection and disconnection, basic initialisation of the device, driver selection, the communication channels (pipes) and does resource management. 
This services layer also controls the default pipes and the device requests transferred over them. The top layer contains the individual drivers supporting specific (classes of) devices. These drivers implement the protocol that is used over the pipes other than the default pipe. They also implement additional functionality to make the device available to other parts of the kernel or userland. They use the USB driver interface (USBDI) exposed by the services layer. [[usb-hc]] == Host Controllers The host controller (HC) controls the transmission of packets on the bus. Frames of 1 millisecond are used. At the start of each frame the host controller generates a Start of Frame (SOF) packet. The SOF packet is used to synchronise to the start of the frame and to keep track of the frame number. Within each frame packets are transferred, either from host to device (out) or from device to host (in). Transfers are always initiated by the host (polled transfers). Therefore there can only be one host per USB bus. Each transfer of a packet has a status stage in which the recipient of the data can return either ACK (acknowledge reception), NAK (retry), STALL (error condition) or nothing (garbled data stage, device not available or disconnected). Section 8.5 of the USB 2.0 Specification explains the details of packets in more detail. Four different types of transfers can occur on a USB bus: control, bulk, interrupt and isochronous. The types of transfers and their characteristics are described below. Large transfers between the device on the USB bus and the device driver are split up into multiple packets by the host controller or the HC driver. Device requests (control transfers) to the default endpoints are special. They consist of two or three phases: SETUP, DATA (optional) and STATUS. The set-up packet is sent to the device. If there is a data phase, the direction of the data packet(s) is given in the set-up packet. 
The direction in the status phase is the opposite of the direction during the data phase, or IN if there was no data phase.

The host controller hardware also provides registers with the current status of the root ports and the changes that have occurred since the last reset of the status change register.
Access to these registers is provided through a virtualised hub as suggested in the USB specification.
The virtual hub must comply with the hub device class given in chapter 11 of that specification.
It must provide a default pipe through which device requests can be sent to it.
It returns the standard and hub class specific set of descriptors.
It should also provide an interrupt pipe that reports changes happening at its ports.

There are currently two specifications for host controllers available: Universal Host Controller Interface (UHCI) from Intel and Open Host Controller Interface (OHCI) from Compaq, Microsoft, and National Semiconductor.
The UHCI specification has been designed to reduce hardware complexity by requiring the host controller driver to supply a complete schedule of the transfers for each frame.
OHCI type controllers are much more independent by providing a more abstract interface doing a lot of work themselves.

=== UHCI

The UHCI host controller maintains a framelist with 1024 pointers to per frame data structures.
It understands two different data types: transfer descriptors (TD) and queue heads (QH).
Each TD represents a packet to be communicated to or from a device endpoint.
QHs are a means to group TDs (and QHs) together.

Each transfer consists of one or more packets.
The UHCI driver splits large transfers into multiple packets.
For every transfer, apart from isochronous transfers, a QH is allocated.
For every type of transfer these QHs are collected at a QH for that type.
Isochronous transfers have to be executed first because of the fixed latency requirement and are directly referred to by the pointer in the framelist.
The last isochronous TD refers to the QH for interrupt transfers for that frame. All QHs for interrupt transfers point at the QH for control transfers, which in turn points at the QH for bulk transfers. The following diagram gives a graphical overview of this: This results in the following schedule being run in each frame. After fetching the pointer for the current frame from the framelist the controller first executes the TDs for all the isochronous packets in that frame. The last of these TDs refers to the QH for the interrupt transfers for that frame. The host controller will then descend from that QH to the QHs for the individual interrupt transfers. After finishing that queue, the QH for the interrupt transfers will refer the controller to the QH for all control transfers. It will execute all the subqueues scheduled there, followed by all the transfers queued at the bulk QH. To facilitate the handling of finished or failed transfers different types of interrupts are generated by the hardware at the end of each frame. In the last TD for a transfer the Interrupt-On Completion bit is set by the HC driver to flag an interrupt when the transfer has completed. An error interrupt is flagged if a TD reaches its maximum error count. If the short packet detect bit is set in a TD and less than the set packet length is transferred this interrupt is flagged to notify the controller driver of the completed transfer. It is the host controller driver's task to find out which transfer has completed or produced an error. When called the interrupt service routine will locate all the finished transfers and call their callbacks. Refer to the UHCI Specification for a more elaborate description. === OHCI Programming an OHCI host controller is much simpler. The controller assumes that a set of endpoints is available, and is aware of scheduling priorities and the ordering of the types of transfers in a frame. 
The main data structure used by the host controller is the endpoint descriptor (ED) to which a queue of transfer descriptors (TDs) is attached.
The ED contains the maximum packet size allowed for an endpoint and the controller hardware does the splitting into packets.
The pointers to the data buffers are updated after each transfer and when the start and end pointer are equal, the TD is retired to the done-queue.
The four types of endpoints (interrupt, isochronous, control, and bulk) have their own queues.
Control and bulk endpoints are queued each at their own queue.
Interrupt EDs are queued in a tree, with the level in the tree defining the frequency at which they run.

The schedule being run by the host controller in each frame looks as follows.
The controller will first run the non-periodic control and bulk queues, up to a time limit set by the HC driver.
Then the interrupt transfers for that frame number are run, by using the lower five bits of the frame number as an index into level 0 of the tree of interrupt EDs.
At the end of this tree the isochronous EDs are connected and these are traversed subsequently.
The isochronous TDs contain the frame number of the first frame the transfer should be run in.
After all the periodic transfers have been run, the control and bulk queues are traversed again.
Periodically the interrupt service routine is called to process the done queue and call the callbacks for each transfer and reschedule interrupt and isochronous endpoints.

See the OHCI Specification for a more elaborate description.

The middle layer provides access to the device in a controlled way and maintains resources in use by the different drivers and the services layer.
The layer takes care of the following aspects:

* The device configuration information
* The pipes to communicate with a device
* Probing and attaching and detaching from a device.
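The frame-number indexing of the interrupt-ED tree described above can be sketched in plain C. The names are illustrative (the real OHCI driver uses its own constants), but the arithmetic is the one the schedule relies on: 32 leaves selected by the low five bits of the frame number, with each level toward the root halving the polling interval:

```c
#include <assert.h>

/*
 * Hedged sketch of OHCI interrupt scheduling: the lower five bits of
 * the frame number pick one of 32 level-0 (leaf) entries of the
 * interrupt-ED tree; a node at a higher level is reached from more
 * leaves and therefore polls more often. Illustrative names only.
 */
#define TOY_OHCI_LEAVES 32

/* Which leaf slot runs in a given 1 ms frame. */
static int
toy_intr_slot(unsigned frame)
{
	return frame & (TOY_OHCI_LEAVES - 1);
}

/* Polling interval (in frames) for an ED queued at a given tree
 * level, level 0 being the leaves (every 32 frames) and level 5 the
 * root (every frame). */
static int
toy_level_interval(int level)
{
	return TOY_OHCI_LEAVES >> level;
}
```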
[[usb-dev]]
== USB Device Information

=== Device Configuration Information

Each device provides different levels of configuration information.
Each device has one or more configurations, of which one is selected during probe/attach.
A configuration provides power and bandwidth requirements.
Within each configuration there can be multiple interfaces.
A device interface is a collection of endpoints.
For example USB speakers can have an interface for the audio data (Audio Class) and an interface for the knobs, dials and buttons (HID Class).
All interfaces in a configuration are active at the same time and can be attached to by different drivers.
Each interface can have alternates, providing different quality of service parameters.
In cameras, for example, this is used to provide different frame sizes and numbers of frames per second.

Within each interface, 0 or more endpoints can be specified.
Endpoints are the unidirectional access points for communicating with a device.
They provide buffers to temporarily store incoming or outgoing data from the device.
Each endpoint has a unique address within a configuration, the endpoint's number plus its direction.
The default endpoint, endpoint 0, is not part of any interface and available in all configurations.
It is managed by the services layer and not directly available to device drivers.

This hierarchical configuration information is described in the device by a standard set of descriptors (see section 9.6 of the USB specification).
They can be requested through the Get Descriptor Request.
The services layer caches these descriptors to avoid unnecessary transfers on the USB bus.
Access to the descriptors is provided through function calls.

* Device descriptors: General information about the device, like Vendor, Product and Revision Id, supported device class, subclass and protocol if applicable, maximum packet size for the default endpoint, etc.
* Configuration descriptors: The number of interfaces in this configuration, suspend and resume functionality supported and power requirements.
* Interface descriptors: interface class, subclass and protocol if applicable, number of alternate settings for the interface and the number of endpoints.
* Endpoint descriptors: Endpoint address, direction and type, maximum packet size supported and polling frequency if type is interrupt endpoint.
There is no descriptor for the default endpoint (endpoint 0) and it is never counted in an interface descriptor.
* String descriptors: In the other descriptors string indices are supplied for some fields.
These can be used to retrieve descriptive strings, possibly in multiple languages.

Class specifications can add their own descriptor types that are available through the Get Descriptor Request.

=== Pipes

Communication to endpoints on a device flows through so-called pipes.
Drivers submit transfers for an endpoint to its pipe and provide a callback to be called on completion or failure of the transfer (asynchronous transfers) or wait for completion (synchronous transfer).
Transfers to an endpoint are serialised in the pipe.
A transfer can either complete, fail or time-out (if a time-out has been set).
There are two types of time-outs for transfers.
Time-outs can happen due to time-out on the USB bus (milliseconds).
These time-outs are seen as failures and can be due to disconnection of the device.
A second form of time-out is implemented in software and is triggered when a transfer does not complete within a specified amount of time (seconds).
These are caused by a device acknowledging negatively (NAK) the transferred packets.
The cause for this is the device not being ready to receive data, buffer under- or overrun or protocol errors.
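The asynchronous submission model described above (submit a transfer, get a callback on completion, failure, or time-out) can be sketched with a toy transfer structure. All names are illustrative, not the real USBDI API:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hedged sketch of the pipe transfer model: a driver hands in a
 * transfer with a callback; the "host controller" later completes it
 * with a status and invokes the callback. Illustrative names only.
 */
enum toy_status { XFER_DONE, XFER_FAILED, XFER_TIMEOUT };

struct toy_xfer {
	enum toy_status status;
	void (*callback)(struct toy_xfer *);
	int completed;			/* set by the driver's callback */
};

/* The "host controller" side: finish a transfer and notify the driver. */
static void
toy_complete(struct toy_xfer *x, enum toy_status st)
{
	x->status = st;
	if (x->callback != NULL)
		x->callback(x);
}

/* The driver's completion callback. */
static void
toy_cb(struct toy_xfer *x)
{
	x->completed = (x->status == XFER_DONE);
}

static int
demo_async(void)
{
	struct toy_xfer x = { XFER_FAILED, toy_cb, 0 };

	toy_complete(&x, XFER_DONE);	/* transfer finishes successfully */
	return x.completed;
}
```

A synchronous transfer is the same mechanism with the submitter simply sleeping until the callback fires.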
If a transfer over a pipe is larger than the maximum packet size specified in the associated endpoint descriptor, the host controller (OHCI) or the HC driver (UHCI) will split the transfer into packets of maximum packet size, with the last packet possibly smaller than the maximum packet size.

Sometimes it is not a problem for a device to return less data than requested.
For example a bulk-in transfer to a modem might request 200 bytes of data, but the modem has only 5 bytes available at that time.
The driver can set the short packet (SPD) flag.
It allows the host controller to accept a packet even if the amount of data transferred is less than requested.
This flag is only valid for in-transfers, as the amount of data to be sent to a device is always known beforehand.
If an unrecoverable error occurs in a device during a transfer the pipe is stalled.
Before any more data is accepted or sent the driver needs to resolve the cause of the stall and clear the endpoint stall condition by sending the clear endpoint halt device request over the default pipe.
The default endpoint should never stall.

There are four different types of endpoints and corresponding pipes:

* Control pipe / default pipe: There is one control pipe per device, connected to the default endpoint (endpoint 0).
The pipe carries the device requests and associated data.
The difference between transfers over the default pipe and other pipes is that the protocol for the transfers is described in the USB specification.
These requests are used to reset and configure the device.
A basic set of commands that must be supported by each device is provided in chapter 9 of the USB specification.
The commands supported on this pipe can be extended by a device class specification to support additional functionality.
* Bulk pipe: This is the USB equivalent to a raw transmission medium.
* Interrupt pipe: The host sends a request for data to the device and if the device has nothing to send, it will NAK the data packet.
Interrupt transfers are scheduled at a frequency specified when creating the pipe.
* Isochronous pipe: these pipes are intended for isochronous data, for example video or audio streams, with fixed latency but no guaranteed delivery. Some support for pipes of this type is available in the current implementation.

Packets in control, bulk and interrupt transfers are retried if an error occurs during transmission or if the device negatively acknowledges (NAK) the packet, for example due to lack of buffer space to store the incoming data. Isochronous packets are however not retried in case of failed delivery or NAK of a packet, as this might violate the timing constraints.

The availability of the necessary bandwidth is calculated during the creation of the pipe. Transfers are scheduled within frames of 1 millisecond. The bandwidth allocation within a frame is prescribed by the USB specification, section 5.6 [2]. Isochronous and interrupt transfers are allowed to consume up to 90% of the bandwidth within a frame. Packets for control and bulk transfers are scheduled after all isochronous and interrupt packets and consume all the remaining bandwidth.

More information on the scheduling of transfers and bandwidth reclamation can be found in chapter 5 of the USB specification, section 1.3 of the UHCI specification, and section 3.4.2 of the OHCI specification.

[[usb-devprobe]]
== Device Probe and Attach

After the notification by the hub that a new device has been connected, the service layer switches on the port, providing the device with 100 mA of current. At this point the device is in its default state and listening to device address 0. The service layer will proceed to retrieve the various descriptors through the default pipe. After that it will send a Set Address request to move the device away from the default device address (address 0). Multiple device drivers might be able to support the device.
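When several drivers can support one device, each probe reports how well it matches and the best match wins. A C sketch of the idea; the match-level names and numeric values here are hypothetical, chosen only to show that a revision-specific match outranks a vendor/product match, which outranks a generic match:

```c
#include <stddef.h>

/* Hypothetical match levels -- the real bus code defines its own constants. */
enum { UMATCH_NONE = 0, UMATCH_GENERIC = 1,
       UMATCH_VENDOR_PRODUCT = 3, UMATCH_VENDOR_PRODUCT_REV = 4 };

struct usb_dev_ids { int vendor, product, revision; };

/* A vendor-specific driver's probe: an exact revision match ranks highest,
 * a vendor/product match next; anything else defers to the generic driver. */
static int example_probe(const struct usb_dev_ids *dev,
    const struct usb_dev_ids *supported)
{
	if (dev->vendor != supported->vendor ||
	    dev->product != supported->product)
		return UMATCH_NONE;
	if (dev->revision == supported->revision)
		return UMATCH_VENDOR_PRODUCT_REV;
	return UMATCH_VENDOR_PRODUCT;
}

/* The bus attaches the driver whose probe returned the highest value. */
static int best_match(int a, int b) { return a > b ? a : b; }
```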
For example, a modem driver might be able to support an ISDN TA through the AT-compatibility interface, while a driver for that specific model of ISDN adapter might be able to provide much better support for the device. To support this flexibility, the probes return priorities indicating their level of support. Support for a specific revision of a product ranks highest and the generic driver ranks lowest. It might also be that multiple drivers attach to one device if there are multiple interfaces within one configuration. Each driver then only needs to support a subset of the interfaces.

The probing for a driver for a newly attached device checks first for device-specific drivers. If none is found, the probe code iterates over all supported configurations until a driver attaches in a configuration. To support devices with multiple drivers on different interfaces, the probe iterates over all interfaces in a configuration that have not yet been claimed by a driver. Configurations that exceed the power budget for the hub are ignored.

During attach the driver should initialise the device to its proper state, but not reset it, as a reset makes the device disconnect itself from the bus and restarts the probing process for it. To avoid consuming unnecessary bandwidth, a driver should not claim the interrupt pipe at attach time, but should postpone allocating the pipe until the file is opened and the data is actually used. When the file is closed the pipe should be closed again, even though the device might still be attached.

=== Device Disconnect and Detach

A device driver should expect to receive errors during any transaction with the device. The design of USB supports and encourages the disconnection of devices at any point in time. Drivers should make sure that they do the right thing when the device disappears.

Furthermore, a device that has been disconnected and reconnected will not be reattached at the same device instance.
This might change in the future when more devices support serial numbers (see the device descriptor) or other means of defining an identity for a device have been developed.

The disconnection of a device is signaled by a hub in the interrupt packet delivered to the hub driver. The status change information indicates which port has seen a connection change. The device detach method of all device drivers for the device connected on that port is called and the structures are cleaned up. If the port status indicates that in the mean time a device has been connected to that port, the procedure for probing and attaching the device is started. A device reset produces a disconnect-connect sequence on the hub and is handled as described above.

[[usb-protocol]]
== USB Drivers Protocol Information

The protocol used over pipes other than the default pipe is undefined by the USB specification. Information on it can be found from various sources. The most accurate source is the developer's section on the USB home pages. From these pages a growing number of device class specifications are available. These specifications describe what a compliant device should look like from a driver perspective, the basic functionality it needs to provide and the protocol that is to be used over the communication channels. The USB specification includes the description of the Hub class. A class specification for Human Interface Devices (HID) has been created to cater for keyboards, tablets, bar-code readers, buttons, knobs, switches, etc. A third example is the class specification for mass storage devices. For a full list of device classes see the developers section on the USB home pages.

For many devices the protocol information has not yet been published, however. Information on the protocol being used might be available from the company making the device. Some companies will require you to sign a Non-Disclosure Agreement (NDA) before giving you the specifications.
This in most cases precludes making the driver open source.

Another good source of information is the Linux driver sources, as a number of companies have started to provide drivers for Linux for their devices. It is always a good idea to contact the authors of those drivers to ask about their source of information.

=== Example: Human Interface Devices

The specification for Human Interface Devices such as keyboards, mice, tablets, buttons, dials, etc. is referred to in other device class specifications and is used in many devices. For example, audio speakers provide endpoints to the digital-to-analogue converters and possibly an extra pipe for a microphone. They also provide a HID endpoint in a separate interface for the buttons and dials on the front of the device. The same is true for the monitor control class. It is straightforward to build support for these interfaces through the available kernel and userland libraries together with the HID class driver or the generic driver. Another device that serves as an example of interfaces within one configuration driven by different device drivers is a cheap keyboard with a built-in legacy mouse port. To avoid the cost of including the hardware for a USB hub in the device, manufacturers combined the mouse data received from the PS/2 port on the back of the keyboard and the key presses from the keyboard into two separate interfaces in the same configuration. The mouse and keyboard drivers each attach to the appropriate interface and allocate the pipes to the two independent endpoints.

=== Example: Firmware download

Many devices are based on a general-purpose processor with an additional USB core added to it. Because the development of drivers and firmware for USB devices is still very new, many devices require the firmware to be downloaded after they have been connected.

The procedure followed is straightforward. The device identifies itself through a vendor and product ID.
The first driver probes and attaches to it and downloads the firmware into it. After that the device soft-resets itself and the driver is detached. After a short pause the device announces its presence on the bus. The device will have changed its vendor/product/revision ID to reflect the fact that it has been supplied with firmware, and as a consequence a second driver will probe it and attach to it.

An example of these types of devices is the ActiveWire I/O board, based on the EZ-USB chip. For this chip a generic firmware downloader is available. The firmware downloaded into the ActiveWire board changes the revision ID. It will then perform a soft reset of the USB part of the EZ-USB chip to disconnect from the USB bus and reconnect.

=== Example: Mass Storage Devices

Support for mass storage devices is mainly built around existing protocols. The Iomega USB Zip drive is based on the SCSI version of their drive. The SCSI commands and status messages are wrapped in blocks and transferred over the bulk pipes to and from the device, emulating a SCSI controller over the USB wire. ATAPI and UFI commands are supported in a similar fashion.

The Mass Storage Specification supports two different types of wrapping of the command block. The initial attempt was based on sending the command and status through the default pipe and using bulk transfers for the data to be moved between the host and the device. Based on experience, a second approach was designed that wraps the command and status blocks and sends them over the bulk-out and bulk-in endpoints. The specification states exactly what has to happen when, and what has to be done when an error condition is encountered.

The biggest challenge when writing drivers for these devices is to fit a USB-based protocol into the existing support for mass storage devices. CAM provides hooks to do this in a fairly straightforward way.
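In the second, bulk-only style of wrapping, each command travels in a fixed 31-byte Command Block Wrapper sent over the bulk-out pipe ahead of the data phase. A sketch of its layout, with field names as in the mass-storage Bulk-Only Transport specification:

```c
#include <stdint.h>

/* 31-byte Command Block Wrapper (CBW) from the USB mass-storage
 * Bulk-Only Transport specification; illustrative only. */
struct bulk_only_cbw {
	uint32_t dCBWSignature;          /* 0x43425355, "USBC" little-endian */
	uint32_t dCBWTag;                /* echoed back in the status wrapper */
	uint32_t dCBWDataTransferLength; /* bytes expected in the data phase */
	uint8_t  bmCBWFlags;             /* bit 7: 1 = data flows device-to-host */
	uint8_t  bCBWLUN;                /* logical unit number */
	uint8_t  bCBWCBLength;           /* valid bytes in CBWCB, 1-16 */
	uint8_t  CBWCB[16];              /* the wrapped (e.g. SCSI) command block */
} __attribute__((packed));
```

The device answers each command with a 13-byte Command Status Wrapper on the bulk-in pipe carrying the same tag, a data residue count and a status byte; the matching tag is how a driver pairs a status with its command.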
ATAPI is less simple, as historically the IDE interface has never had many different appearances. The support for the USB floppy from Y-E Data is again less straightforward, as a new command set has been designed.

diff --git a/documentation/content/en/books/developers-handbook/ipv6/_index.adoc b/documentation/content/en/books/developers-handbook/ipv6/_index.adoc
index a97feddf59..e80cf22dbd 100644
--- a/documentation/content/en/books/developers-handbook/ipv6/_index.adoc
+++ b/documentation/content/en/books/developers-handbook/ipv6/_index.adoc
@@ -1,851 +1,851 @@
---
title: Chapter 8. IPv6 Internals
authors:
  - author: Yoshinobu Inoue
prev: books/developers-handbook/sockets
next: books/developers-handbook/partiii
description: IPv6 Internals
tags: ["IPv6", "FreeBSD"]
showBookMenu: true
weight: 10
params:
  path: "/books/developers-handbook/ipv6/"
---

[[ipv6]]
= IPv6 Internals
:doctype: book
:toc: macro
:toclevels: 1
:icons: font
:sectnums:
:sectnumlevels: 6
:sectnumoffset: 8
:partnums:
:source-highlighter: rouge
:experimental:
:images-path: books/developers-handbook/

ifdef::env-beastie[]
ifdef::backend-html5[]
:imagesdir: ../../../../images/{images-path}
endif::[]
ifndef::book[]
include::shared/authors.adoc[]
include::shared/mirrors.adoc[]
include::shared/releases.adoc[]
include::shared/attributes/attributes-{{% lang %}}.adoc[]
include::shared/{{% lang %}}/teams.adoc[]
include::shared/{{% lang %}}/mailing-lists.adoc[]
include::shared/{{% lang %}}/urls.adoc[]
toc::[]
endif::[]
ifdef::backend-pdf,backend-epub3[]
include::../../../../../shared/asciidoctor.adoc[]
endif::[]
endif::[]
ifndef::env-beastie[]
toc::[]
include::../../../../../shared/asciidoctor.adoc[]
endif::[]

[[ipv6-implementation]]
== IPv6/IPsec Implementation

This section should explain IPv6 and IPsec related implementation internals.
These functionalities are derived from the http://www.kame.net/[KAME project].

[[ipv6details]]
=== IPv6

==== Conformance

The IPv6 related functions conform, or try to conform, to the latest set of IPv6 specifications. For future reference we list some of the relevant documents below (_NOTE_: this is not a complete list - this is too hard to maintain...). For details please refer to the specific chapter in the document, RFCs, manual pages, or comments in the source code.

Conformance tests have been performed on the KAME STABLE kit at the TAHI project. Results can be viewed at http://www.tahi.org/report/KAME/[http://www.tahi.org/report/KAME/]. We also attended University of New Hampshire IOL tests (http://www.iol.unh.edu/[http://www.iol.unh.edu/]) in the past, with our past snapshots.

* RFC1639: FTP Operation Over Big Address Records (FOOBAR)
** RFC2428 is preferred over RFC1639. FTP clients will first try RFC2428, then RFC1639 if that fails.
* RFC1886: DNS Extensions to support IPv6
* RFC1933: Transition Mechanisms for IPv6 Hosts and Routers
** IPv4 compatible address is not supported.
** automatic tunneling (described in 4.3 of this RFC) is not supported.
** man:gif[4] interface implements IPv[46]-over-IPv[46] tunnel in a generic way,
- and it covers "configured tunnel" described in the spec. See crossref:ipv6[gif,23.5.1.5] in this document for details.
+ and it covers "configured tunnel" described in the spec. See crossref:ipv6[gif,Generic Tunnel Interface] in this document for details.
* RFC1981: Path MTU Discovery for IPv6
* RFC2080: RIPng for IPv6
** usr.sbin/route6d supports this.
* RFC2292: Advanced Sockets API for IPv6
** For supported library functions/kernel APIs, see [.filename]#sys/netinet6/ADVAPI#.
* RFC2362: Protocol Independent Multicast-Sparse Mode (PIM-SM)
** RFC2362 defines packet formats for PIM-SM. [.filename]#draft-ietf-pim-ipv6-01.txt# is written based on this.
* RFC2373: IPv6 Addressing Architecture
** supports node required addresses, and conforms to the scope requirement.
* RFC2374: An IPv6 Aggregatable Global Unicast Address Format
** supports 64-bit length of Interface ID.
* RFC2375: IPv6 Multicast Address Assignments
** Userland applications use the well-known addresses assigned in the RFC.
* RFC2428: FTP Extensions for IPv6 and NATs
** RFC2428 is preferred over RFC1639. FTP clients will first try RFC2428, then RFC1639 if that fails.
* RFC2460: IPv6 specification
* RFC2461: Neighbor discovery for IPv6
-** See crossref:ipv6[neighbor-discovery,23.5.1.2] in this document for details.
+** See crossref:ipv6[neighbor-discovery,Neighbor Discovery] in this document for details.
* RFC2462: IPv6 Stateless Address Autoconfiguration
-** See crossref:ipv6[ipv6-pnp,23.5.1.4] in this document for details.
+** See crossref:ipv6[ipv6-pnp,Plug and Play] in this document for details.
* RFC2463: ICMPv6 for IPv6 specification
-** See crossref:ipv6[icmpv6,23.5.1.9] in this document for details.
+** See crossref:ipv6[icmpv6,ICMPv6] in this document for details.
* RFC2464: Transmission of IPv6 Packets over Ethernet Networks
* RFC2465: MIB for IPv6: Textual Conventions and General Group
** Necessary statistics are gathered by the kernel. Actual IPv6 MIB support is provided as a patchkit for ucd-snmp.
* RFC2466: MIB for IPv6: ICMPv6 group
** Necessary statistics are gathered by the kernel. Actual IPv6 MIB support is provided as a patchkit for ucd-snmp.
* RFC2467: Transmission of IPv6 Packets over FDDI Networks
* RFC2497: Transmission of IPv6 packets over ARCnet Networks
* RFC2553: Basic Socket Interface Extensions for IPv6
** IPv4 mapped address (3.7) and special behavior of IPv6 wildcard bind socket
- (3.8) are supported. See crossref:ipv6[ipv6-wildcard-socket,23.5.1.12] in this document for details.
+ (3.8) are supported. See crossref:ipv6[ipv6-wildcard-socket,IPv4 Mapped Address and IPv6 Wildcard Socket] in this document for details.
* RFC2675: IPv6 Jumbograms
-** See crossref:ipv6[ipv6-jumbo,23.5.1.7] in this document for details.
+** See crossref:ipv6[ipv6-jumbo,Jumbo Payload] in this document for details.
* RFC2710: Multicast Listener Discovery for IPv6
* RFC2711: IPv6 router alert option
* [.filename]#draft-ietf-ipngwg-router-renum-08#: Router renumbering for IPv6
* [.filename]#draft-ietf-ipngwg-icmp-namelookups-02#: IPv6 Name Lookups Through ICMP
* [.filename]#draft-ietf-ipngwg-icmp-name-lookups-03#: IPv6 Name Lookups Through ICMP
* [.filename]#draft-ietf-pim-ipv6-01.txt#: PIM for IPv6
** man:pim6dd[8] implements dense mode. man:pim6sd[8] implements sparse mode.
* [.filename]#draft-itojun-ipv6-tcp-to-anycast-00#: Disconnecting TCP connection toward IPv6 anycast address
* [.filename]#draft-yamamoto-wideipv6-comm-model-00#
-** See crossref:ipv6[ipv6-sas,23.5.1.6] in this document for details.
+** See crossref:ipv6[ipv6-sas,Source Address Selection] in this document for details.
* [.filename]#draft-ietf-ipngwg-scopedaddr-format-00.txt#: An Extension of Format for IPv6 Scoped Addresses

[[neighbor-discovery]]
==== Neighbor Discovery

Neighbor Discovery is fairly stable. Currently Address Resolution, Duplicated Address Detection, and Neighbor Unreachability Detection are supported. In the near future we will be adding Proxy Neighbor Advertisement support in the kernel and an Unsolicited Neighbor Advertisement transmission command as an admin tool.

If DAD fails, the address will be marked "duplicated" and a message will be generated to syslog (and usually to the console). The "duplicated" mark can be checked with man:ifconfig[8]. It is administrators' responsibility to check for and recover from DAD failures. The behavior should be improved in the near future.

Some network drivers loop multicast packets back to themselves, even if instructed not to do so (especially in promiscuous mode).
In such cases DAD may fail, because the DAD engine sees an inbound NS packet (actually from the node itself) and considers it a sign of a duplicate. You may want to look at the #if condition marked "heuristics" in sys/netinet6/nd6_nbr.c:nd6_dad_timer() as a workaround (note that the code fragment in the "heuristics" section is not spec conformant).

The Neighbor Discovery specification (RFC2461) does not talk about neighbor cache handling in the following cases:

. when there was no neighbor cache entry, and the node received an unsolicited RS/NS/NA/redirect packet without a link-layer address
. neighbor cache handling on a medium without link-layer addresses (we need a neighbor cache entry for the IsRouter bit)

For the first case, we implemented a workaround based on discussions on the IETF ipngwg mailing list. For more details, see the comments in the source code and the email thread started from (IPng 7155), dated Feb 6 1999.

The IPv6 on-link determination rule (RFC2461) is quite different from assumptions in the BSD network code. At this moment, no on-link determination rule is supported where the default router list is empty (RFC2461, section 5.2, last sentence in 2nd paragraph - note that the spec misuses the words "host" and "node" in several places in the section).

To avoid possible DoS attacks and infinite loops, only 10 options on an ND packet are accepted now. Therefore, if you have 20 prefix options attached to an RA, only the first 10 prefixes will be recognized. If this troubles you, please ask about it on the FREEBSD-CURRENT mailing list and/or modify nd6_maxndopt in [.filename]#sys/netinet6/nd6.c#. If there are high demands we may provide a sysctl knob for the variable.

[[ipv6-scope-index]]
==== Scope Index

IPv6 uses scoped addresses. It is therefore very important to specify the scope index (interface index for a link-local address, or site index for a site-local address) together with an IPv6 address. Without a scope index, a scoped IPv6 address is ambiguous to the kernel, and the kernel will not be able to determine the outbound interface for a packet.
Ordinary userland applications should use the advanced API (RFC2292) to specify the scope index, or interface index. For a similar purpose, the sin6_scope_id member of the sockaddr_in6 structure is defined in RFC2553. However, the semantics of sin6_scope_id are rather vague. If you care about the portability of your application, we suggest you use the advanced API rather than sin6_scope_id.

In the kernel, the interface index for a link-local scoped address is embedded into the 2nd 16-bit word (the 3rd and 4th bytes) of the IPv6 address. For example, you may see something like:

[source,bash]
....
fe80:1::200:f8ff:fe01:6317
....

in the routing table and interface address structure (struct in6_ifaddr). The address above is a link-local unicast address which belongs to a network interface whose interface identifier is 1. The embedded index enables us to identify IPv6 link-local addresses over multiple interfaces effectively and with only a little code change.

Routing daemons and configuration programs, like man:route6d[8] and man:ifconfig[8], will need to manipulate the "embedded" scope index. These programs use routing sockets and ioctls (like SIOCGIFADDR_IN6) and the kernel API will return IPv6 addresses with the 2nd 16-bit word filled in. The APIs are for manipulating kernel internal structures. Programs that use these APIs have to be prepared for differences between kernels anyway.

When you specify a scoped address on the command line, NEVER write the embedded form (such as ff02:1::1 or fe80:2::fedc). This is not supposed to work. Always use the standard form, like ff02::1 or fe80::fedc, with a command line option for specifying the interface (like `ping -6 -I ne0 ff02::1`). In general, if a command does not have a command line option to specify the outgoing interface, that command is not ready to accept scoped addresses. This may seem to be opposite to IPv6's premise to support the "dentist office" situation. We believe that the specifications need some improvement here.
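A minimal userland sketch of the embedding described above (the kernel has its own helpers for this; these two functions are purely illustrative):

```c
#include <stdint.h>

/* Embed the interface index into the 2nd 16-bit word (bytes 2 and 3)
 * of a link-local address, and read it back.  `addr` is the 16-byte
 * address in network byte order.  The index must be cleared again
 * before the address goes on the wire. */
static void embed_scope(uint8_t addr[16], uint16_t ifindex)
{
	addr[2] = (uint8_t)(ifindex >> 8);
	addr[3] = (uint8_t)(ifindex & 0xff);
}

static uint16_t embedded_scope(const uint8_t addr[16])
{
	return (uint16_t)(addr[2] << 8 | addr[3]);
}
```

Embedding index 1 this way is what turns a plain fe80:: prefix into the fe80:1:: form shown above.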
Some of the userland tools support the extended numeric IPv6 syntax, as documented in [.filename]#draft-ietf-ipngwg-scopedaddr-format-00.txt#. You can specify the outgoing link by using the name of the outgoing interface, like "fe80::1%ne0". This way you will be able to specify a link-local scoped address without much trouble.

To use this extension in your program, you will need to use man:getaddrinfo[3], and man:getnameinfo[3] with NI_WITHSCOPEID. The implementation currently assumes a 1-to-1 relationship between a link and an interface, which is stronger than what the specs say.

[[ipv6-pnp]]
==== Plug and Play

Most of the IPv6 stateless address autoconfiguration is implemented in the kernel. Neighbor Discovery functions are implemented in the kernel as a whole. Router Advertisement (RA) input for hosts is implemented in the kernel. Router Solicitation (RS) output for endhosts, RS input for routers, and RA output for routers are implemented in userland.

===== Assignment of link-local, and special addresses

The IPv6 link-local address is generated from the IEEE802 address (Ethernet MAC address). Each interface is assigned an IPv6 link-local address automatically when the interface comes up (IFF_UP). Also, a direct route for the link-local address is added to the routing table.

Here is output of the netstat command:

[source,bash]
....
Internet6:
Destination                   Gateway                   Flags      Netif Expire
fe80:1::%ed0/64               link#1                    UC          ed0
fe80:2::%ep0/64               link#2                    UC          ep0
....

Interfaces that have no IEEE802 address (pseudo interfaces like tunnel interfaces, or ppp interfaces) will borrow an IEEE802 address from other interfaces, such as Ethernet interfaces, whenever possible. If there is no IEEE802 hardware attached, a last-resort pseudo-random value, MD5(hostname), will be used as the source of the link-local address. If that is not suitable for your usage, you will need to configure the link-local address manually.
If an interface is not capable of handling IPv6 (such as lacking multicast support), a link-local address will not be assigned to that interface. See section 2 for details.

Each interface joins the solicited-node multicast address and the link-local all-nodes multicast address (e.g., fe80::1:ff01:6317 and ff02::1, respectively, on the link the interface is attached to). In addition to a link-local address, the loopback address (::1) will be assigned to the loopback interface. Also, ::1/128 and ff01::/32 are automatically added to the routing table, and the loopback interface joins the node-local multicast group ff01::1.

===== Stateless address autoconfiguration on Hosts

In the IPv6 specification, nodes are separated into two categories: _routers_ and _hosts_. Routers forward packets addressed to others; hosts do not forward packets. net.inet6.ip6.forwarding defines whether this node is a router or a host (router if it is 1, host if it is 0).

When a host hears a Router Advertisement from a router, it may autoconfigure itself by stateless address autoconfiguration. This behavior can be controlled by net.inet6.ip6.accept_rtadv (the host autoconfigures itself if it is set to 1). By autoconfiguration, the network address prefix for the receiving interface (usually the global address prefix) is added. The default route is also configured.

Routers periodically generate Router Advertisement packets. To request an adjacent router to generate an RA packet, a host can transmit a Router Solicitation. To generate an RS packet at any time, use the _rtsol_ command. The man:rtsold[8] daemon is also available. man:rtsold[8] generates Router Solicitations whenever necessary, and it works great for nomadic usage (notebooks/laptops). If one wishes to ignore Router Advertisements, use sysctl to set net.inet6.ip6.accept_rtadv to 0.

To generate Router Advertisements from a router, use the man:rtadvd[8] daemon.
Note that the IPv6 specification assumes the following items, and nonconforming cases are left unspecified:

* Only hosts will listen to router advertisements
* Hosts have a single network interface (except loopback)

Therefore, it is unwise to enable net.inet6.ip6.accept_rtadv on routers or multi-interface hosts. A misconfigured node can behave strangely (nonconforming configuration is allowed for those who would like to do some experiments).

To summarize the sysctl knobs:

[source,bash]
....
accept_rtadv    forwarding      role of the node
---             ---             ---
0               0               host (to be manually configured)
0               1               router
1               0               autoconfigured host
                                (spec assumes that host has single
                                interface only, autoconfigured host
                                with multiple interfaces is
                                out-of-scope)
1               1               invalid, or experimental
                                (out-of-scope of spec)
....

RFC2462 has a validation rule against incoming RA prefix information options, in 5.5.3 (e). This is to protect hosts from malicious (or misconfigured) routers that advertise a very short prefix lifetime. There was an update from Jim Bound to the ipngwg mailing list (look for "(ipng 6712)" in the archive) and Jim's update is implemented.

-See crossref:ipv6[neighbor-discovery,23.5.1.2] in the document for relationship between DAD and autoconfiguration.
+See crossref:ipv6[neighbor-discovery,Neighbor Discovery] in the document for relationship between DAD and autoconfiguration.

[[gif]]
==== Generic Tunnel Interface

GIF (Generic InterFace) is a pseudo interface for configured tunnels. Details are described in man:gif[4]. Currently

* v6 in v6
* v6 in v4
* v4 in v6
* v4 in v4

are available. Use man:gifconfig[8] to assign the physical (outer) source and destination addresses to gif interfaces.

A configuration that uses the same address family for inner and outer IP headers (v4 in v4, or v6 in v6) is dangerous. It is very easy to configure interfaces and routing tables to perform an infinite level of tunneling. _Please be warned_.

gif can be configured to be ECN-friendly.
-See crossref:ipv6[ipsec-ecn,23.5.4.5] for ECN-friendliness of tunnels, and man:gif[4] for how to configure.
+See crossref:ipv6[ipsec-ecn,ECN Consideration on IPsec Tunnels] for ECN-friendliness of tunnels, and man:gif[4] for how to configure.

If you would like to configure an IPv4-in-IPv6 tunnel with a gif interface, read man:gif[4] carefully. You will need to remove the IPv6 link-local address automatically assigned to the gif interface.

[[ipv6-sas]]
==== Source Address Selection

The current source address selection rule is scope oriented (there are some exceptions - see below). For a given destination, a source IPv6 address is selected by the following rules:

. If the source address is explicitly specified by the user (e.g., via the advanced API), the specified address is used.
. If there is an address assigned to the outgoing interface (which is usually determined by looking up the routing table) that has the same scope as the destination address, the address is used.
+
This is the most typical case.
. If there is no address that satisfies the above condition, choose a global address assigned to one of the interfaces on the sending node.
. If there is no address that satisfies the above condition, and the destination address is of site-local scope, choose a site-local address assigned to one of the interfaces on the sending node.
. If there is no address that satisfies the above condition, choose the address associated with the routing table entry for the destination. This is the last resort, which may cause a scope violation.

For instance, ::1 is selected for ff01::1, and fe80:1::200:f8ff:fe01:6317 for
-fe80:1::2a0:24ff:feab:839b (note that embedded interface index - described in crossref:ipv6[ipv6-scope-index,23.5.1.3] - helps us choose the right source address.
+fe80:1::2a0:24ff:feab:839b (note that embedded interface index - described in crossref:ipv6[ipv6-scope-index,Scope Index] - helps us choose the right source address.
Those embedded indices will not be on the wire).
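When rule 2 leaves several candidate addresses of the right scope on the outgoing interface, the tie is broken by the longest-match comparison discussed next. A userland sketch of the bit counting (the kernel uses its own helper for this):

```c
#include <stdint.h>

/* Number of leading bits two 16-byte IPv6 addresses have in common.
 * Illustrative only; compares byte by byte, then bit by bit within
 * the first differing byte. */
static int prefix_match_len(const uint8_t a[16], const uint8_t b[16])
{
	int bits = 0, i;

	for (i = 0; i < 16; i++) {
		uint8_t x = a[i] ^ b[i];
		if (x == 0) {		/* whole byte matches */
			bits += 8;
			continue;
		}
		while ((x & 0x80) == 0) {	/* count matching leading bits */
			bits++;
			x <<= 1;
		}
		break;
	}
	return bits;
}
```

With the example addresses used below, the candidate 2001:0DB8:808:1:200:f8ff:fe01:6317 shares 44 leading bits with the destination 2001:0DB8:800::1, while 2001:0DB8:9:124:200:f8ff:fe01:6317 shares only 36, so the first is chosen.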
If the outgoing interface has multiple addresses for the scope, a source is selected on a longest-match basis (rule 3). Suppose 2001:0DB8:808:1:200:f8ff:fe01:6317 and 2001:0DB8:9:124:200:f8ff:fe01:6317 are assigned to the outgoing interface. 2001:0DB8:808:1:200:f8ff:fe01:6317 is chosen as the source for the destination 2001:0DB8:800::1.

Note that the above rule is not documented in the IPv6 spec. It is considered an "up to implementation" item. There are some cases where we do not use the above rule. One example is a connected TCP session, where we use the address kept in the tcb as the source. Another example is the source address for a Neighbor Advertisement. Under the spec (RFC2461 7.2.2) the NA's source should be the target address of the corresponding NS's target. In this case we follow the spec rather than the above longest-match rule.

For new connections (when rule 1 does not apply), deprecated addresses (addresses with preferred lifetime = 0) will not be chosen as the source address if other choices are available. If no other choices are available, a deprecated address will be used as a last resort. If there are multiple choices of deprecated addresses, the above scope rule will be used to choose from those deprecated addresses. If you would like to prohibit the use of deprecated addresses for some reason, configure net.inet6.ip6.use_deprecated to 0. The issue related to deprecated addresses is described in RFC2462 5.5.4 (NOTE: there is some debate underway in the IETF ipngwg on how to use "deprecated" addresses).

[[ipv6-jumbo]]
==== Jumbo Payload

The Jumbo Payload hop-by-hop option is implemented and can be used to send IPv6 packets with payloads longer than 65,535 octets. But currently no physical interface whose MTU is more than 65,535 is supported, so such payloads can be seen only on the loopback interface (i.e., lo0).
If you want to try jumbo payloads, you first have to reconfigure the kernel so that the MTU of the loopback interface is more than 65,535 bytes; add the following to the kernel configuration file:

`options "LARGE_LOMTU" #To test jumbo payload`

and recompile the new kernel.

Then you can test jumbo payloads with the man:ping[8] command with the -6, -b and -s options. The -b option must be specified to enlarge the size of the socket buffer and the -s option specifies the length of the packet, which should be more than 65,535. For example, type as follows:

[source,bash]
....
% ping -6 -b 70000 -s 68000 ::1
....

The IPv6 specification requires that the Jumbo Payload option must not be used in a packet that carries a fragment header. If this condition is broken, an ICMPv6 Parameter Problem message must be sent to the sender. The specification is followed, but you cannot usually see an ICMPv6 error caused by this requirement.

When an IPv6 packet is received, the frame length is checked and compared to the length specified in the payload length field of the IPv6 header or in the value of the Jumbo Payload option, if any. If the former is shorter than the latter, the packet is discarded and statistics are incremented. You can see the statistics in the output of the man:netstat[8] command with the `-s -p ip6` option:

[source,bash]
....
% netstat -s -p ip6
ip6:
        (snip)
        1 with data size < data length
....

So, the kernel does not send an ICMPv6 error unless the erroneous packet is an actual Jumbo Payload, that is, its packet size is more than 65,535 bytes. As described above, currently no physical interface with such a huge MTU is supported, so it rarely returns an ICMPv6 error.

TCP/UDP over jumbogram is not supported at this moment. This is because we have no medium (other than loopback) to test this. Contact us if you need this.

IPsec does not work on jumbograms.
This is due to some specification twists in supporting AH with jumbograms (the AH header size influences the payload length, and this makes it really hard to authenticate an inbound packet with the jumbo payload option as well as AH).

There are fundamental issues in *BSD support for jumbograms. We would like to address those, but we need more time to finalize the work. To name a few:

* The mbuf pkthdr.len field is typed as "int" in 4.4BSD, so it will not hold a jumbogram with len > 2G on 32-bit architecture CPUs. If we would like to support jumbograms properly, the field must be expanded to hold 4G + IPv6 header + link-layer header. Therefore, it must be expanded to at least int64_t (u_int32_t is NOT enough).
* We mistakenly use "int" to hold packet length in many places. We need to convert them to a larger integral type. This needs great care, as we may experience overflow during packet length computation.
* We mistakenly check the ip6_plen field of the IPv6 header for the packet payload length in various places. We should be checking mbuf pkthdr.len instead. ip6_input() will perform a sanity check on the jumbo payload option on input, and we can safely use mbuf pkthdr.len afterwards.
* The TCP code needs a careful update in a bunch of places, of course.

==== Loop Prevention in Header Processing

The IPv6 specification allows an arbitrary number of extension headers to be placed onto packets. If we implement IPv6 packet processing code the way the BSD IPv4 code is implemented, the kernel stack may overflow due to a long function call chain. The sys/netinet6 code is carefully designed to avoid kernel stack overflow, so it defines its own protocol switch structure, "struct ip6protosw" (see [.filename]#netinet6/ip6protosw.h#). There is no such update to the IPv4 part (sys/netinet) for compatibility, but a small change was added to its pr_input() prototype. So "struct ipprotosw" is also defined. As a result, if you receive an IPsec-over-IPv4 packet with a massive number of IPsec headers, the kernel stack may blow up.
IPsec-over-IPv6 is okay. (Of course, for all those IPsec headers to be processed, each such IPsec header must pass each IPsec check. So an anonymous attacker will not be able to do such an attack.)

[[icmpv6]]
==== ICMPv6

After RFC2463 was published, IETF ipngwg decided to disallow ICMPv6 error packets against ICMPv6 redirects, to prevent ICMPv6 storms on a network medium. This is already implemented in the kernel.

==== Applications

For userland programming, we support the IPv6 socket API as specified in RFC2553, RFC2292 and upcoming Internet drafts.

TCP/UDP over IPv6 is available and quite stable. You can enjoy man:telnet[1], man:ftp[1], man:rlogin[1], man:rsh[1], man:ssh[1], etc. These applications are protocol independent. That is, they automatically choose IPv4 or IPv6 according to DNS.

==== Kernel Internals

While ip_forward() calls ip_output(), ip6_forward() directly calls if_output() since routers must not divide IPv6 packets into fragments.

ICMPv6 should contain the original packet as long as possible, up to 1280 bytes. UDP6/IP6 port unreach, for instance, should contain all extension headers and the *unchanged* UDP6 and IP6 headers. So, all IP6 functions except TCP never convert network byte order into host byte order, to save the original packet.

tcp_input(), udp6_input() and icmp6_input() cannot assume that the IP6 header directly precedes the transport headers, due to extension headers. So, in6_cksum() was implemented to handle packets whose IP6 header and transport header are not contiguous. Neither TCP/IP6 nor UDP6/IP6 header structures exist for checksum calculation.

To process the IP6 header, extension headers and transport headers easily, network drivers are now required to store packets in one internal mbuf or in one or more external mbufs. A typical old driver prepares two internal mbufs for 96 - 204 bytes of data; however, now such packet data is stored in one external mbuf.

`netstat -s -p ip6` tells you whether or not your driver conforms to this requirement.
In the following example, "cce0" violates the requirement. (For more information, refer to Section 2.)

[source,bash]
....
Mbuf statistics:
        317 one mbuf
        two or more mbuf::
                lo0 = 8
                cce0 = 10
        3282 one ext mbuf
        0 two or more ext mbuf
....

Each input function calls IP6_EXTHDR_CHECK in the beginning to check if the region between IP6 and its header is contiguous. IP6_EXTHDR_CHECK calls m_pullup() only if the mbuf has the M_LOOP flag, that is, the packet comes from the loopback interface. m_pullup() is never called for packets coming from physical network interfaces.

Both IP and IP6 reassemble functions never call m_pullup().

[[ipv6-wildcard-socket]]
==== IPv4 Mapped Address and IPv6 Wildcard Socket

RFC2553 describes IPv4 mapped addresses (3.7) and the special behavior of the IPv6 wildcard bind socket (3.8). The spec allows you to:

* Accept IPv4 connections by an AF_INET6 wildcard bind socket.
* Transmit IPv4 packets over an AF_INET6 socket by using a special form of the address like ::ffff:10.1.1.1.

but the spec itself is very complicated and does not specify how the socket layer should behave. Here we call the former the "listening side" and the latter the "initiating side", for reference purposes.

You can perform wildcard bind on both of the address families, on the same port. The following table shows the behavior of FreeBSD 4.x.

[source,bash]
....
                listening side          initiating side
                (AF_INET6 wildcard      (connection to ::ffff:10.1.1.1)
                socket gets IPv4 conn.)
                ---                     ---
FreeBSD 4.x     configurable            supported
                default: enabled
....

The following sections will give you more details, and how you can configure the behavior.

Comments on the listening side:

It looks like RFC2553 says too little on the wildcard bind issue, especially on the port space issue, failure mode and the relationship between AF_INET/INET6 wildcard binds. There can be several separate interpretations of this RFC which conform to it but behave differently.
So, to implement a portable application you should assume nothing about the behavior in the kernel. Using man:getaddrinfo[3] is the safest way. Port number space and wildcard bind issues were discussed in detail on the ipv6imp mailing list, in mid March 1999, and it looks like there is no concrete consensus (meaning, it is up to implementers). You may want to check the mailing list archives.

If a server application would like to accept IPv4 and IPv6 connections, there are two alternatives.

One is using AF_INET and AF_INET6 sockets (you will need two sockets). Use man:getaddrinfo[3] with AI_PASSIVE in ai_flags, and man:socket[2] and man:bind[2] for all the addresses returned. By opening multiple sockets, you can accept connections onto the socket with the proper address family. IPv4 connections will be accepted by the AF_INET socket, and IPv6 connections will be accepted by the AF_INET6 socket.

Another way is using one AF_INET6 wildcard bind socket. Use man:getaddrinfo[3] with AI_PASSIVE in ai_flags and with AF_INET6 in ai_family, and set the 1st argument hostname to NULL. Then man:socket[2] and man:bind[2] to the address returned (it should be the IPv6 unspecified address). You can accept both IPv4 and IPv6 packets via this one socket.

To support only IPv6 traffic on an AF_INET6 wildcard bound socket portably, always check the peer address when a connection is made toward the AF_INET6 listening socket. If the address is an IPv4 mapped address, you may want to reject the connection. You can check the condition by using the IN6_IS_ADDR_V4MAPPED() macro.

To resolve this issue more easily, there is a system dependent man:setsockopt[2] option, IPV6_BINDV6ONLY, used like below.

[.programlisting]
....
	int on;

	on = 1;
	if (setsockopt(s, IPPROTO_IPV6, IPV6_BINDV6ONLY,
	    (char *)&on, sizeof (on)) < 0)
		/* error handling */
....

When this call succeeds, this socket receives IPv6 packets only.
Comments on the initiating side:

Advice to application implementers: to implement a portable IPv6 application (which works on multiple IPv6 kernels), we believe that the following is the key to success:

* NEVER hardcode AF_INET nor AF_INET6.
* Use man:getaddrinfo[3] and man:getnameinfo[3] throughout the system. Never use gethostby*(), getaddrby*(), inet_*() or getipnodeby*(). (To update existing applications to be IPv6 aware easily, getipnodeby*() will sometimes be useful. But if possible, try to rewrite the code to use man:getaddrinfo[3] and man:getnameinfo[3].)
* If you would like to connect to a destination, use man:getaddrinfo[3] and try all the destinations returned, like man:telnet[1] does.
* Some of the IPv6 stacks are shipped with a buggy man:getaddrinfo[3]. Ship a minimal working version with your application and use that as a last resort.

If you would like to use an AF_INET6 socket for both IPv4 and IPv6 outgoing connections, you will need to use man:getipnodebyname[3]. When you would like to update your existing application to be IPv6 aware with minimal effort, this approach might be chosen. But please note that it is a temporary solution, because man:getipnodebyname[3] itself is not recommended as it does not handle scoped IPv6 addresses at all. For IPv6 name resolution, man:getaddrinfo[3] is the preferred API. So you should rewrite your application to use man:getaddrinfo[3], when you get the time to do it.

When writing applications that make outgoing connections, the story is much simpler if you treat AF_INET and AF_INET6 as totally separate address families. The {set,get}sockopt issue becomes simpler, and the DNS issue becomes simpler too. We do not recommend that you rely upon IPv4 mapped addresses.

===== unified tcp and inpcb code

FreeBSD 4.x uses shared tcp code between IPv4 and IPv6 (from sys/netinet/tcp*) and separate udp4/6 code. It uses a unified inpcb structure. The platform can be configured to support IPv4 mapped addresses.
Kernel configuration is summarized as follows:

* By default, an AF_INET6 socket will grab IPv4 connections under certain conditions, and can initiate a connection to an IPv4 destination embedded in an IPv4 mapped IPv6 address.
* You can disable it on the entire system with sysctl like below.
+
`sysctl net.inet6.ip6.mapped_addr=0`

====== Listening Side

Each socket can be configured to support the special AF_INET6 wildcard bind (enabled by default). You can disable it on a per-socket basis with man:setsockopt[2] like below.

[.programlisting]
....
	int on;

	on = 1;
	if (setsockopt(s, IPPROTO_IPV6, IPV6_BINDV6ONLY,
	    (char *)&on, sizeof (on)) < 0)
		/* error handling */
....

A wildcard AF_INET6 socket grabs an IPv4 connection if and only if the following conditions are satisfied:

* there is no AF_INET socket that matches the IPv4 connection
* the AF_INET6 socket is configured to accept IPv4 traffic, i.e., getsockopt(IPV6_BINDV6ONLY) returns 0.

There is no problem with open/close ordering.

====== Initiating Side

FreeBSD 4.x supports outgoing connections to IPv4 mapped addresses (::ffff:10.1.1.1), if the node is configured to support IPv4 mapped addresses.

==== sockaddr_storage

When RFC2553 was about to be finalized, there was discussion on how struct sockaddr_storage members should be named. One proposal was to prepend "__" to the members (like "__ss_len") as they should not be touched. The other proposal was not to prepend it (like "ss_len") as we need to touch those members directly. There was no clear consensus on it.

As a result, RFC2553 defines struct sockaddr_storage as follows:

[.programlisting]
....
	struct sockaddr_storage {
		u_char	__ss_len;	/* address length */
		u_char	__ss_family;	/* address family */
		/* and bunch of padding */
	};
....

On the contrary, the XNET draft defines it as follows:

[.programlisting]
....
	struct sockaddr_storage {
		u_char	ss_len;		/* address length */
		u_char	ss_family;	/* address family */
		/* and bunch of padding */
	};
....

In December 1999, it was agreed that RFC2553bis should pick the latter (XNET) definition.
The current implementation conforms to the XNET definition, based on the RFC2553bis discussion.

If you look at multiple IPv6 implementations, you will be able to see both definitions. As a userland programmer, the most portable way of dealing with it is to:

. ensure ss_family and/or ss_len are available on the platform, by using GNU autoconf,
. have -Dss_family=__ss_family to unify all occurrences (including header files) into __ss_family, or
. never touch __ss_family. cast to sockaddr * and use sa_family like:
+
[.programlisting]
....
	struct sockaddr_storage ss;
	family = ((struct sockaddr *)&ss)->sa_family
....

=== Network Drivers

Now the following two items are required to be supported by standard drivers:

. mbuf clustering requirement. In this stable release, we changed MINCLSIZE into MHLEN+1 for all the operating systems in order to make all the drivers behave as we expect.
. multicast. If man:ifmcstat[8] yields no multicast group for an interface, that interface has to be patched.

If any of the drivers do not support these requirements, then the drivers cannot be used for IPv6 and/or IPsec communication. If you find any problem with your card using IPv6/IPsec, then please report it to the {freebsd-bugs}.

(NOTE: In the past we required all PCMCIA drivers to have a call to in6_ifattach(). We have no such requirement any more.)

=== Translator

We categorize IPv4/IPv6 translators into 4 types:

* _Translator A_ --- It is used in the early stage of transition to make it possible to establish a connection from an IPv6 host in an IPv6 island to an IPv4 host in the IPv4 ocean.
* _Translator B_ --- It is used in the early stage of transition to make it possible to establish a connection from an IPv4 host in the IPv4 ocean to an IPv6 host in an IPv6 island.
* _Translator C_ --- It is used in the late stage of transition to make it possible to establish a connection from an IPv4 host in an IPv4 island to an IPv6 host in the IPv6 ocean.
* _Translator D_ --- It is used in the late stage of transition to make it possible to establish a connection from an IPv6 host in the IPv6 ocean to an IPv4 host in an IPv4 island.

[[ipsec-implementation]]
=== IPsec

IPsec is mainly organized by three components.

. Policy Management
. Key Management
. AH and ESP handling

==== Policy Management

The kernel implements experimental policy management code. There are two ways to manage security policy. One is to configure per-socket policy using man:setsockopt[2]. In this case, policy configuration is described in man:ipsec_set_policy[3]. The other is to configure kernel packet filter-based policy using the PF_KEY interface, via man:setkey[8].

The policy entries are not re-ordered with their indexes, so the order of entries when you add them is very significant.

==== Key Management

The key management code implemented in this kit (sys/netkey) is a home-brew PFKEY v2 implementation. This conforms to RFC2367.

The home-brew IKE daemon, "racoon", is included in the kit (kame/kame/racoon). Basically you will need to run racoon as a daemon, then set up a policy to require keys (like `ping -P 'out ipsec esp/transport//use'`). The kernel will contact the racoon daemon as necessary to exchange keys.

==== AH and ESP Handling

The IPsec module is implemented as "hooks" into the standard IPv4/IPv6 processing. When sending a packet, ip{,6}_output() checks if ESP/AH processing is required by checking if a matching SPD (Security Policy Database) entry is found. If ESP/AH is needed, {esp,ah}{4,6}_output() will be called and the mbuf will be updated accordingly.

When a packet is received, {esp,ah}4_input() will be called based on protocol number, i.e., (*inetsw[proto])(). {esp,ah}4_input() will decrypt/check the authenticity of the packet, and strip off the daisy-chained header and padding for ESP/AH. It is safe to strip off the ESP/AH header on packet reception, since we will never use the received packet in "as is" form.
By using ESP/AH, the TCP4/6 effective data segment size will be affected by the extra daisy-chained headers inserted by ESP/AH. Our code takes care of this case.

Basic crypto functions can be found in the directory "sys/crypto". ESP/AH transforms are listed in {esp,ah}_core.c with wrapper functions. If you wish to add an algorithm, add a wrapper function in {esp,ah}_core.c, and add your crypto algorithm code into sys/crypto.

Tunnel mode is partially supported in this release, with the following restrictions:

* An IPsec tunnel is not combined with the GIF generic tunneling interface. It needs great care because we may create an infinite loop between ip_output() and tunnelifp->if_output(). Opinion varies on whether it is better to unify them, or not.
* MTU and Don't Fragment bit (IPv4) considerations need more checking, but basically it works fine.
* The authentication model for the AH tunnel must be revisited. We will need to improve the policy management engine, eventually.

==== Conformance to RFCs and IDs

The IPsec code in the kernel conforms (or, tries to conform) to the following standards:

"old IPsec" specification documented in [.filename]#rfc182[5-9].txt#

"new IPsec" specification documented in [.filename]#rfc240[1-6].txt#, [.filename]#rfc241[01].txt#, [.filename]#rfc2451.txt# and [.filename]#draft-mcdonald-simple-ipsec-api-01.txt# (draft expired, but you can take it from link:ftp://ftp.kame.net/pub/internet-drafts/[ftp://ftp.kame.net/pub/internet-drafts/]).
(NOTE: IKE specifications, [.filename]#rfc241[7-9].txt#, are implemented in userland, as the "racoon" IKE daemon.)

Currently supported algorithms are:

* old IPsec AH
** null crypto checksum (no document, just for debugging)
** keyed MD5 with 128bit crypto checksum ([.filename]#rfc1828.txt#)
** keyed SHA1 with 128bit crypto checksum (no document)
** HMAC MD5 with 128bit crypto checksum ([.filename]#rfc2085.txt#)
** HMAC SHA1 with 128bit crypto checksum (no document)
* old IPsec ESP
** null encryption (no document, similar to [.filename]#rfc2410.txt#)
** DES-CBC mode ([.filename]#rfc1829.txt#)
* new IPsec AH
** null crypto checksum (no document, just for debugging)
** keyed MD5 with 96bit crypto checksum (no document)
** keyed SHA1 with 96bit crypto checksum (no document)
** HMAC MD5 with 96bit crypto checksum ([.filename]#rfc2403.txt#)
** HMAC SHA1 with 96bit crypto checksum ([.filename]#rfc2404.txt#)
* new IPsec ESP
** null encryption ([.filename]#rfc2410.txt#)
** DES-CBC with derived IV ([.filename]#draft-ietf-ipsec-ciph-des-derived-01.txt#, draft expired)
** DES-CBC with explicit IV ([.filename]#rfc2405.txt#)
** 3DES-CBC with explicit IV ([.filename]#rfc2451.txt#)
** BLOWFISH CBC ([.filename]#rfc2451.txt#)
** CAST128 CBC ([.filename]#rfc2451.txt#)
** RC5 CBC ([.filename]#rfc2451.txt#)
** each of the above can be combined with:
*** ESP authentication with HMAC-MD5(96bit)
*** ESP authentication with HMAC-SHA1(96bit)

The following algorithms are NOT supported:

* old IPsec AH
** HMAC MD5 with 128bit crypto checksum + 64bit replay prevention ([.filename]#rfc2085.txt#)
** keyed SHA1 with 160bit crypto checksum + 32bit padding ([.filename]#rfc1852.txt#)

IPsec (in the kernel) and IKE (in userland as "racoon") have been tested at several interoperability test events, and they are known to interoperate with many other implementations well.
Also, the current IPsec implementation has quite wide coverage of the IPsec crypto algorithms documented in RFCs (we cover algorithms without intellectual property issues only).

[[ipsec-ecn]]
==== ECN Consideration on IPsec Tunnels

ECN-friendly IPsec tunneling is supported as described in [.filename]#draft-ipsec-ecn-00.txt#.

Normal IPsec tunneling is described in RFC2401. On encapsulation, the IPv4 TOS field (or, the IPv6 traffic class field) will be copied from the inner IP header to the outer IP header. On decapsulation the outer IP header will be simply dropped. The decapsulation rule is not compatible with ECN, since the ECN bits in the outer IP TOS/traffic class field will be lost.

To make IPsec tunnels ECN-friendly, we should modify the encapsulation and decapsulation procedures. This is described in http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt[http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt], chapter 3.

The IPsec tunnel implementation can give you three behaviors, by setting net.inet.ipsec.ecn (or net.inet6.ipsec6.ecn) to some value:

* RFC2401: no consideration for ECN (sysctl value -1)
* ECN forbidden (sysctl value 0)
* ECN allowed (sysctl value 1)

Note that the behavior is configurable in a per-node manner, not a per-SA manner (draft-ipsec-ecn-00 wants per-SA configuration, but it looks like too much to me).

The behavior is summarized as follows (see the source code for more detail):

[source,bash]
....
                encapsulate                     decapsulate
                ---                             ---
RFC2401         copy all TOS bits               drop TOS bits on outer
                from inner to outer.            (use inner TOS bits as is)

ECN forbidden   copy TOS bits except for ECN    drop TOS bits on outer
                (masked with 0xfc) from inner   (use inner TOS bits as is)
                to outer.  set ECN bits to 0.

ECN allowed     copy TOS bits except for ECN    use inner TOS bits with some
                CE (masked with 0xfe) from      change.  if outer ECN CE bit
                inner to outer.                 is 1, enable ECN CE bit on
                set ECN CE bit to 0.            the inner.
....
The general strategy for configuration is as follows:

* if both IPsec tunnel endpoints are capable of ECN-friendly behavior, you had better configure both ends to "ECN allowed" (sysctl value 1).
* if the other end is very strict about TOS bits, use "RFC2401" (sysctl value -1).
* in other cases, use "ECN forbidden" (sysctl value 0).

The default behavior is "ECN forbidden" (sysctl value 0).

For more information, please refer to: http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt[http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt], RFC2481 (Explicit Congestion Notification), src/sys/netinet6/{ah,esp}_input.c

(Thanks go to Kenjiro Cho mailto:kjc@csl.sony.co.jp[kjc@csl.sony.co.jp] for detailed analysis.)

==== Interoperability

Here are (some of the) platforms that KAME code has tested IPsec/IKE interoperability with in the past. Note that both ends may have modified their implementations, so use the following list just for reference purposes.

Altiga, Ashley-laurent (vpcom.com), Data Fellows (F-Secure), Ericsson ACC, FreeS/WAN, HITACHI, IBM AIX(R), IIJ, Intel, Microsoft(R) Windows NT(R), NIST (linux IPsec + plutoplus), Netscreen, OpenBSD, RedCreek, Routerware, SSH, Secure Computing, Soliton, Toshiba, VPNet, Yamaha RT100i

diff --git a/documentation/content/en/books/developers-handbook/l10n/_index.adoc b/documentation/content/en/books/developers-handbook/l10n/_index.adoc
index 5760eaef62..34153230f6 100644
--- a/documentation/content/en/books/developers-handbook/l10n/_index.adoc
+++ b/documentation/content/en/books/developers-handbook/l10n/_index.adoc
@@ -1,272 +1,272 @@
---
title: Chapter 4. Localization and Internationalization - L10N and I18N
authors:
prev: books/developers-handbook/secure
next: books/developers-handbook/policies
description: Localization and Internationalization - L10N and I18N in FreeBSD
tags: ["L10N", "I18N", "Localization", "Internationalization", "FreeBSD"]
showBookMenu: true
weight: 5
params:
  path: "/books/developers-handbook/l10n/"
---

[[l10n]]
= Localization and Internationalization - L10N and I18N
:doctype: book
:toc: macro
:toclevels: 1
:icons: font
:sectnums:
:sectnumlevels: 6
:sectnumoffset: 4
:partnums:
:source-highlighter: rouge
:experimental:
:images-path: books/developers-handbook/

ifdef::env-beastie[]
ifdef::backend-html5[]
:imagesdir: ../../../../images/{images-path}
endif::[]
ifndef::book[]
include::shared/authors.adoc[]
include::shared/mirrors.adoc[]
include::shared/releases.adoc[]
include::shared/attributes/attributes-{{% lang %}}.adoc[]
include::shared/{{% lang %}}/teams.adoc[]
include::shared/{{% lang %}}/mailing-lists.adoc[]
include::shared/{{% lang %}}/urls.adoc[]
toc::[]
endif::[]
ifdef::backend-pdf,backend-epub3[]
include::../../../../../shared/asciidoctor.adoc[]
endif::[]
endif::[]

ifndef::env-beastie[]
toc::[]
include::../../../../../shared/asciidoctor.adoc[]
endif::[]

[[l10n-programming]]
== Programming I18N Compliant Applications

To make your application more useful for speakers of other languages, we hope that you will program it I18N compliant. The GNU gcc compiler and GUI libraries like QT and GTK support I18N through special handling of strings. Making a program I18N compliant is very easy. It allows contributors to port your application to other languages quickly. Refer to the library specific I18N documentation for more details.

Contrary to common perception, I18N compliant code is easy to write. Usually, it only involves wrapping your strings with library specific functions. In addition, please be sure to allow for wide or multibyte character support.
=== A Call to Unify the I18N Effort

It has come to our attention that the individual I18N/L10N efforts for each country have been duplicating each other's work. Many of us have been reinventing the wheel repeatedly and inefficiently. We hope that the various major groups in I18N could congregate into a group effort similar to the Core Team's responsibility.

Currently, we hope that, when you write or port I18N programs, you will send them out to each country's related FreeBSD mailing list for testing. In the future, we hope to create applications that work in all the languages out-of-the-box without dirty hacks.

The {freebsd-i18n} has been established. If you are an I18N/L10N developer, please send your comments, ideas, questions, and anything you deem related to it.

=== Perl and Python

Perl and Python have I18N and wide character handling libraries. Please use them for I18N compliance.

[[posix-nls]]
== Localized Messages with POSIX.1 Native Language Support (NLS)

Beyond the basic I18N functions, like supporting various input encodings or supporting national conventions, such as the different decimal separators, at a higher level of I18N it is possible to localize the messages written to the output by the various programs. A common way of doing this is using the POSIX.1 NLS functions, which are provided as a part of the FreeBSD base system.

[[nls-catalogs]]
=== Organizing Localized Messages into Catalog Files

POSIX.1 NLS is based on catalog files, which contain the localized messages in the desired encoding. The messages are organized into sets and each message is identified by an integer number in the containing set. The catalog files are conventionally named after the locale they contain localized messages for, followed by the `.msg` extension. For instance, the Hungarian messages for ISO8859-2 encoding should be stored in a file called [.filename]#hu_HU.ISO8859-2#.

These catalog files are common text files that contain the numbered messages.
It is possible to write comments by starting the line with a `$` sign. Set boundaries are also separated by special comments, where the keyword `set` must directly follow the `$` sign. The `set` keyword is then followed by the set number. For example:

[.programlisting]
....
$set 1
....

The actual message entries start with the message number, followed by the localized message. The well-known modifiers from man:printf[3] are accepted:

[.programlisting]
....
15 "File not found: %s\n"
....

The language catalog files have to be compiled into a binary form before they can be opened from the program. This conversion is done with the man:gencat[1] utility. Its first argument is the filename of the compiled catalog and its further arguments are the input catalogs. The localized messages can also be organized into several catalog files and then all of them can be processed with man:gencat[1].

[[nls-using]]
=== Using the Catalog Files from the Source Code

Using the catalog files is simple. To use the related functions, [.filename]#nl_types.h# must be included. Before using a catalog, it has to be opened with man:catopen[3]. The function takes two arguments. The first parameter is the name of the installed and compiled catalog. Usually, the name of the program is used, such as grep. This name will be used when looking for the compiled catalog file. The man:catopen[3] call looks for this file in [.filename]#/usr/share/nls/locale/catname# and in [.filename]#/usr/local/share/nls/locale/catname#, where `locale` is the locale set and `catname` is the catalog name being discussed. The second parameter is a constant, which can have two values:

* `NL_CAT_LOCALE`, which means that the used catalog file will be based on `LC_MESSAGES`.
* `0`, which means that `LANG` has to be used to open the proper catalog.

The man:catopen[3] call returns a catalog identifier of type `nl_catd`. Please refer to the manual page for a list of possible returned error codes.
After opening a catalog, man:catgets[3] can be used to retrieve a message. The first parameter is the catalog identifier returned by man:catopen[3], the second one is the number of the set, the third one is the number of the message, and the fourth one is a fallback message, which will be returned if the requested message cannot be retrieved from the catalog file.

After using the catalog file, it must be closed by calling man:catclose[3], which has one argument, the catalog id.

[[nls-example]]
=== A Practical Example

The following example will demonstrate an easy solution on how to use NLS catalogs in a flexible way.

The below lines need to be put into a common header file of the program, which is included into all source files where localized messages are necessary:

[.programlisting]
....
#ifdef WITHOUT_NLS
#define	getstr(n)	nlsstr[n]
#else
#include <nl_types.h>

extern nl_catd	 catalog;
#define	getstr(n)	catgets(catalog, 1, n, nlsstr[n])
#endif

extern char	*nlsstr[];
....

Next, put these lines into the global declaration part of the main source file:

[.programlisting]
....
#ifndef WITHOUT_NLS
#include <nl_types.h>
nl_catd	 catalog;
#endif

/*
 * Default messages to use when NLS is disabled or no catalog
 * is found.
 */
char	*nlsstr[] = {
	"",
/* 1*/	"some random message",
/* 2*/	"some other message"
};
....

Next come the real code snippets, which open, read, and close the catalog:

[.programlisting]
....
#ifndef WITHOUT_NLS
	catalog = catopen("myapp", NL_CAT_LOCALE);
#endif

...

	printf(getstr(1));

...

#ifndef WITHOUT_NLS
	catclose(catalog);
#endif
....

==== Reducing Strings to Localize

There is a good way of reducing the strings that need to be localized by using libc error messages. This is also useful to avoid duplication and provide consistent error messages for the common errors that can be encountered by a great many programs.

First, here is an example that does not use libc error messages:

[.programlisting]
....
#include <err.h>
...
if (!S_ISDIR(st.st_mode))
	errx(1, "argument is not a directory");
....

This can be transformed to print an error message by reading `errno` and printing an error message accordingly:

[.programlisting]
....
#include <err.h>
#include <errno.h>
...
if (!S_ISDIR(st.st_mode)) {
	errno = ENOTDIR;
	err(1, NULL);
}
....

In this example, the custom string is eliminated, thus translators will have less work when localizing the program and users will see the usual "Not a directory" error message when they encounter this error. This message will probably seem more familiar to them. Please note that it was necessary to include [.filename]#errno.h# in order to directly access `errno`.

It is worth noting that there are cases when `errno` is set automatically by a preceding call, so it is not necessary to set it explicitly:

[.programlisting]
....
#include <err.h>
...
if ((p = malloc(size)) == NULL)
	err(1, NULL);
....

[[nls-mk]]
=== Making use of [.filename]#bsd.nls.mk#

Using the catalog files requires a few repeatable steps, such as compiling the catalogs and installing them to the proper location. In order to simplify this process even more, [.filename]#bsd.nls.mk# introduces some macros. It is not necessary to include [.filename]#bsd.nls.mk# explicitly; it is pulled in from the common Makefiles, such as [.filename]#bsd.prog.mk# or [.filename]#bsd.lib.mk#.

Usually it is enough to define `NLSNAME`, which should have the catalog name mentioned as the first argument of man:catopen[3], and list the catalog files in `NLS` without their `.msg` extension. Here is an example, which makes it possible to disable NLS when used with the code examples before. The `WITHOUT_NLS` man:make[1] variable has to be defined in order to build the program without NLS support.

[.programlisting]
....
.if !defined(WITHOUT_NLS)
NLS=	es_ES.ISO8859-1
NLS+=	hu_HU.ISO8859-2
NLS+=	pt_BR.ISO8859-1
.else
CFLAGS+=	-DWITHOUT_NLS
.endif
....
Conventionally, the catalog files are placed under the [.filename]#nls# subdirectory, and this is the default behavior of [.filename]#bsd.nls.mk#. It is possible, though, to override the location of the catalogs with the `NLSSRCDIR` man:make[1] variable. The default name of the precompiled catalog files also follows the naming convention mentioned before. It can be overridden by setting the `NLSNAME` variable. There are other options to fine-tune the processing of the catalog files, but usually they are not needed, thus they are not described here. For further information on [.filename]#bsd.nls.mk#, please refer to the file itself; it is short and easy to understand.

diff --git a/documentation/content/en/books/developers-handbook/x86/_index.adoc b/documentation/content/en/books/developers-handbook/x86/_index.adoc
index de7cd9e992..b0ca446359 100644
--- a/documentation/content/en/books/developers-handbook/x86/_index.adoc
+++ b/documentation/content/en/books/developers-handbook/x86/_index.adoc
@@ -1,4311 +1,4311 @@
---
title: Chapter 11.
x86 Assembly Language Programming authors: prev: books/developers-handbook/partiv next: books/developers-handbook/partv description: x86 Assembly Language Programming tags: ["x86", "guide"] showBookMenu: true weight: 15 params: path: "/books/developers-handbook/x86/" --- [[x86]] = x86 Assembly Language Programming :doctype: book :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :sectnumoffset: A :partnums: :source-highlighter: rouge :experimental: :images-path: books/developers-handbook/ ifdef::env-beastie[] ifdef::backend-html5[] :imagesdir: ../../../../images/{images-path} endif::[] ifndef::book[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] toc::[] endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] toc::[] include::../../../../../shared/asciidoctor.adoc[] endif::[] _This chapter was written by {stanislav}._ [[x86-intro]] == Synopsis Assembly language programming under UNIX(R) is highly undocumented. It is generally assumed that no one would ever want to use it because various UNIX(R) systems run on different microprocessors, so everything should be written in C for portability. In reality, C portability is quite a myth. Even C programs need to be modified when ported from one UNIX(R) to another, regardless of what processor each runs on. Typically, such a program is full of conditional statements depending on the system it is compiled for. Even if we believe that all of UNIX(R) software should be written in C, or some other high-level language, we still need assembly language programmers: Who else would write the section of C library that accesses the kernel? 
In this chapter I will attempt to show you how you can use assembly language in writing UNIX(R) programs, specifically under FreeBSD.

This chapter does not explain the basics of assembly language. There are enough resources about that (for a complete online course in assembly language, see Randall Hyde's http://webster.cs.ucr.edu/[Art of Assembly Language]; or if you prefer a printed book, take a look at Jeff Duntemann's Assembly Language Step-by-Step (ISBN: 0471375233)). However, once the chapter is finished, any assembly language programmer will be able to write programs for FreeBSD quickly and efficiently.

Copyright (R) 2000-2001 G. Adam Stanislav. All rights reserved.

[[x86-the-tools]]
== The Tools

[[x86-the-assembler]]
=== The Assembler

The most important tool for assembly language programming is the assembler, the software that converts assembly language code into machine language.

Three very different assemblers are available for FreeBSD. Both man:llvm-as[1] (included in package:devel/llvm[]) and man:as[1] (included in package:devel/binutils[]) use the traditional UNIX(R) assembly language syntax.

On the other hand, man:nasm[1] (installed through package:devel/nasm[]) uses the Intel syntax. Its main advantage is that it can assemble code for many operating systems. This chapter uses nasm syntax because most assembly language programmers coming to FreeBSD from other operating systems will find it easier to understand. And, because, quite frankly, that is what I am used to.

[[x86-the-linker]]
=== The Linker

The output of the assembler, like that of any compiler, needs to be linked to form an executable file.

The standard man:ld[1] linker comes with FreeBSD. It works with the code assembled with either assembler.

[[x86-system-calls]]
== System Calls

[[x86-default-calling-convention]]
=== Default Calling Convention

By default, the FreeBSD kernel uses the C calling convention.
Further, although the kernel is accessed using `int 80h`, it is assumed the program will call a function that issues `int 80h`, rather than issuing `int 80h` directly. This convention is very convenient, and quite superior to the Microsoft(R) convention used by MS-DOS(R). Why? Because the UNIX(R) convention allows any program written in any language to access the kernel. An assembly language program can do that as well. For example, we could open a file: [.programlisting] .... kernel: int 80h ; Call kernel ret open: push dword mode push dword flags push dword path mov eax, 5 call kernel add esp, byte 12 ret .... This is a very clean and portable way of coding. If you need to port the code to a UNIX(R) system which uses a different interrupt, or a different way of passing parameters, all you need to change is the kernel procedure. But assembly language programmers like to shave off cycles. The above example requires a `call/ret` combination. We can eliminate it by ``push``ing an extra dword: [.programlisting] .... open: push dword mode push dword flags push dword path mov eax, 5 push eax ; Or any other dword int 80h add esp, byte 16 .... The `5` that we have placed in `EAX` identifies the kernel function, in this case `open`. [[x86-alternate-calling-convention]] === Alternate Calling Convention FreeBSD is an extremely flexible system. It offers other ways of calling the kernel. For it to work, however, the system must have Linux emulation installed. Linux is a UNIX(R) like system. However, its kernel uses the same system-call convention of passing parameters in registers MS-DOS(R) does. As with the UNIX(R) convention, the function number is placed in `EAX`. The parameters, however, are not passed on the stack but in `EBX, ECX, EDX, ESI, EDI, EBP`: [.programlisting] .... open: mov eax, 5 mov ebx, path mov ecx, flags mov edx, mode int 80h .... 
This convention has a great disadvantage over the UNIX(R) way, at least as far as assembly language programming is concerned: Every time you make a kernel call you must `push` the registers, then `pop` them later. This makes your code bulkier and slower.

Nevertheless, FreeBSD gives you a choice.

If you do choose the Linux convention, you must let the system know about it. After your program is assembled and linked, you need to brand the executable:

[source,shell]
....
% brandelf -t Linux filename
....

[[x86-use-geneva]]
=== Which Convention Should You Use?

If you are coding specifically for FreeBSD, you should always use the UNIX(R) convention: It is faster, you can store global variables in registers, you do not have to brand the executable, and you do not impose the installation of the Linux emulation package on the target system.

If you want to create portable code that can also run on Linux, you will probably still want to give the FreeBSD users as efficient a code as possible. I will show you how you can accomplish that after I have explained the basics.

[[x86-call-numbers]]
=== Call Numbers

To tell the kernel which system service you are calling, place its number in `EAX`. Of course, you need to know what the number is.

[[x86-the-syscalls-file]]
==== The [.filename]#syscalls# File

The numbers are listed in [.filename]#syscalls#. `locate syscalls` finds this file in several different formats, all produced automatically from [.filename]#syscalls.master#.

You can find the master file for the default UNIX(R) calling convention in [.filename]#/usr/src/sys/kern/syscalls.master#. If you need to use the other convention implemented in the Linux emulation mode, read [.filename]#/usr/src/sys/i386/linux/syscalls.master#.
[NOTE] ==== Not only do FreeBSD and Linux use different calling conventions, they sometimes use different numbers for the same functions. ==== [.filename]#syscalls.master# describes how the call is to be made: [.programlisting] .... 0 STD NOHIDE { int nosys(void); } syscall nosys_args int 1 STD NOHIDE { void exit(int rval); } exit rexit_args void 2 STD POSIX { int fork(void); } 3 STD POSIX { ssize_t read(int fd, void *buf, size_t nbyte); } 4 STD POSIX { ssize_t write(int fd, const void *buf, size_t nbyte); } 5 STD POSIX { int open(char *path, int flags, int mode); } 6 STD POSIX { int close(int fd); } etc... .... It is the leftmost column that tells us the number to place in `EAX`. The rightmost column tells us what parameters to `push`. They are ``push``ed _from right to left_. For example, to `open` a file, we need to `push` the `mode` first, then `flags`, then the address at which the `path` is stored. [[x86-return-values]] == Return Values A system call would not be useful most of the time if it did not return some kind of a value: The file descriptor of an open file, the number of bytes read to a buffer, the system time, etc. Additionally, the system needs to inform us if an error occurs: A file does not exist, system resources are exhausted, we passed an invalid parameter, etc. [[x86-man-pages]] === Man Pages The traditional place to look for information about various system calls under UNIX(R) systems are the manual pages. FreeBSD describes its system calls in section 2, sometimes in section 3. For example, man:open[2] says: [.blockquote] If successful, `open()` returns a non-negative integer, termed a file descriptor. It returns `-1` on failure, and sets `errno` to indicate the error. The assembly language programmer new to UNIX(R) and FreeBSD will immediately ask the puzzling question: Where is `errno` and how do I get to it? [NOTE] ==== The information presented in the manual pages applies to C programs. 
The assembly language programmer needs additional information. ==== [[x86-where-return-values]] === Where Are the Return Values? Unfortunately, it depends... For most system calls it is in `EAX`, but not for all. A good rule of thumb, when working with a system call for the first time, is to look for the return value in `EAX`. If it is not there, you need further research. [NOTE] ==== I am aware of one system call that returns the value in `EDX`: `SYS_fork`. All others I have worked with use `EAX`. But I have not worked with them all yet. ==== [TIP] ==== If you cannot find the answer here or anywhere else, study libc source code and see how it interfaces with the kernel. ==== [[x86-where-errno]] === Where Is `errno`? Actually, nowhere... `errno` is part of the C language, not the UNIX(R) kernel. When accessing kernel services directly, the error code is returned in `EAX`, the same register the proper return value generally ends up in. This makes perfect sense. If there is no error, there is no error code. If there is an error, there is no return value. One register can contain either. [[x86-how-to-know-error]] === Determining an Error Occurred When using the standard FreeBSD calling convention, the `carry flag` is cleared upon success, set upon failure. When using the Linux emulation mode, the signed value in `EAX` is non-negative upon success, and contains the return value. In case of an error, the value is negative, i.e., `-errno`. [[x86-portable-code]] == Creating Portable Code Portability is generally not one of the strengths of assembly language. Yet, writing assembly language programs for different platforms is possible, especially with nasm. I have written assembly language libraries that can be assembled for such different operating systems as Windows(R) and FreeBSD. It is all the more possible when you want your code to run on two platforms which, while different, are based on similar architectures. For example, FreeBSD is UNIX(R), Linux is UNIX(R) like. 
I only mentioned three differences between them (from an assembly language programmer's perspective): The calling convention, the function numbers, and the way of returning values. [[x86-deal-with-function-numbers]] === Dealing with Function Numbers In many cases the function numbers are the same. However, even when they are not, the problem is easy to deal with: Instead of using numbers in your code, use constants which you have declared differently depending on the target architecture: [.programlisting] .... %ifdef LINUX %define SYS_execve 11 %else %define SYS_execve 59 %endif .... [[x86-deal-with-geneva]] === Dealing with Conventions Both, the calling convention, and the return value (the `errno` problem) can be resolved with macros: [.programlisting] .... %ifdef LINUX %macro system 0 call kernel %endmacro align 4 kernel: push ebx push ecx push edx push esi push edi push ebp mov ebx, [esp+32] mov ecx, [esp+36] mov edx, [esp+40] mov esi, [esp+44] mov ebp, [esp+48] int 80h pop ebp pop edi pop esi pop edx pop ecx pop ebx or eax, eax js .errno clc ret .errno: neg eax stc ret %else %macro system 0 int 80h %endmacro %endif .... [[x86-deal-with-other-portability]] === Dealing with Other Portability Issues The above solutions can handle most cases of writing code portable between FreeBSD and Linux. Nevertheless, with some kernel services the differences are deeper. In that case, you need to write two different handlers for those particular system calls, and use conditional assembly. Luckily, most of your code does something other than calling the kernel, so usually you will only need a few such conditional sections in your code. [[x86-portable-library]] === Using a Library You can avoid portability issues in your main code altogether by writing a library of system calls. Create a separate library for FreeBSD, a different one for Linux, and yet other libraries for more operating systems. 
In your library, write a separate function (or procedure, if you prefer the traditional assembly language terminology) for each system call. Use the C calling convention of passing parameters. But still use `EAX` to pass the call number in. In that case, your FreeBSD library can be very simple, as many seemingly different functions can be just labels to the same code: [.programlisting] .... sys.open: sys.close: [etc...] int 80h ret .... Your Linux library will require more different functions. But even here you can group system calls using the same number of parameters: [.programlisting] .... sys.exit: sys.close: [etc... one-parameter functions] push ebx mov ebx, [esp+12] int 80h pop ebx jmp sys.return ... sys.return: or eax, eax js sys.err clc ret sys.err: neg eax stc ret .... The library approach may seem inconvenient at first because it requires you to produce a separate file your code depends on. But it has many advantages: For one, you only need to write it once and can use it for all your programs. You can even let other assembly language programmers use it, or perhaps use one written by someone else. But perhaps the greatest advantage of the library is that your code can be ported to other systems, even by other programmers, by simply writing a new library without any changes to your code. If you do not like the idea of having a library, you can at least place all your system calls in a separate assembly language file and link it with your main program. Here, again, all porters have to do is create a new object file to link with your main program. [[x86-portable-include]] === Using an Include File If you are releasing your software as (or with) source code, you can use macros and place them in a separate file, which you include in your code. Porters of your software will simply write a new include file. No library or external object file is necessary, yet your code is portable without any need to edit the code. 
[NOTE] ==== This is the approach we will use throughout this chapter. We will name our include file [.filename]#system.inc#, and add to it whenever we deal with a new system call. ==== We can start our [.filename]#system.inc# by declaring the standard file descriptors: [.programlisting] .... %define stdin 0 %define stdout 1 %define stderr 2 .... Next, we create a symbolic name for each system call: [.programlisting] .... %define SYS_nosys 0 %define SYS_exit 1 %define SYS_fork 2 %define SYS_read 3 %define SYS_write 4 ; [etc...] .... We add a short, non-global procedure with a long name, so we do not accidentally reuse the name in our code: [.programlisting] .... section .text align 4 access.the.bsd.kernel: int 80h ret .... We create a macro which takes one argument, the syscall number: [.programlisting] .... %macro system 1 mov eax, %1 call access.the.bsd.kernel %endmacro .... Finally, we create macros for each syscall. These macros take no arguments. [.programlisting] .... %macro sys.exit 0 system SYS_exit %endmacro %macro sys.fork 0 system SYS_fork %endmacro %macro sys.read 0 system SYS_read %endmacro %macro sys.write 0 system SYS_write %endmacro ; [etc...] .... Go ahead, enter it into your editor and save it as [.filename]#system.inc#. We will add more to it as we discuss more syscalls. [[x86-first-program]] == Our First Program We are now ready for our first program, the mandatory Hello, World! [.programlisting] .... %include 'system.inc' section .data hello db 'Hello, World!', 0Ah hbytes equ $-hello section .text global _start _start: push dword hbytes push dword hello push dword stdout sys.write push dword 0 sys.exit .... Here is what it does: Line 1 includes the defines, the macros, and the code from [.filename]#system.inc#. Lines 3-5 are the data: Line 3 starts the data section/segment. Line 4 contains the string "Hello, World!" followed by a new line (`0Ah`). Line 5 creates a constant that contains the length of the string from line 4 in bytes. 
Lines 7-16 contain the code. Note that FreeBSD uses the _elf_ file format for its executables, which requires every program to start at the point labeled `_start` (or, more precisely, the linker expects that). This label has to be global. Lines 10-13 ask the system to write `hbytes` bytes of the `hello` string to `stdout`. Lines 15-16 ask the system to end the program with the return value of `0`. The `SYS_exit` syscall never returns, so the code ends there. [NOTE] ==== If you have come to UNIX(R) from MS-DOS(R) assembly language background, you may be used to writing directly to the video hardware. You will never have to worry about this in FreeBSD, or any other flavor of UNIX(R). As far as you are concerned, you are writing to a file known as [.filename]#stdout#. This can be the video screen, or a telnet terminal, or an actual file, or even the input of another program. Which one it is, is for the system to figure out. ==== [[x86-assemble-1]] === Assembling the Code Type the code in an editor, and save it in a file named [.filename]#hello.asm#. You need nasm to assemble it. [[x86-get-nasm]] ==== Installing nasm If you do not have nasm, type: [source,shell] .... % su Password:your root password # cd /usr/ports/devel/nasm # make install # exit % .... You may type `make install clean` instead of just `make install` if you do not want to keep nasm source code. Either way, FreeBSD will automatically download nasm from the Internet, compile it, and install it on your system. [NOTE] ==== If your system is not FreeBSD, you need to get nasm from its https://sourceforge.net/projects/nasm[home page]. You can still use it to assemble FreeBSD code. ==== Now you can assemble, link, and run the code: [source,shell] .... % nasm -f elf hello.asm % ld -s -o hello hello.o % ./hello Hello, World! % .... 
[[x86-unix-filters]]
== Writing UNIX(R) Filters

A common type of UNIX(R) application is a filter: a program that reads data from the [.filename]#stdin#, processes it somehow, then writes the result to [.filename]#stdout#.

In this chapter, we shall develop a simple filter, and learn how to read from [.filename]#stdin# and write to [.filename]#stdout#. This filter will convert each byte of its input into a hexadecimal number followed by a blank space.

[.programlisting]
....
%include	'system.inc'

section	.data
hex	db	'0123456789ABCDEF'
buffer	db	0, 0, ' '

section	.text
global	_start
_start:
	; read a byte from stdin
	push	dword 1
	push	dword buffer
	push	dword stdin
	sys.read
	add	esp, byte 12
	or	eax, eax
	je	.done

	; convert it to hex
	movzx	eax, byte [buffer]
	mov	edx, eax
	shr	dl, 4
	mov	dl, [hex+edx]
	mov	[buffer], dl
	and	al, 0Fh
	mov	al, [hex+eax]
	mov	[buffer+1], al

	; print it
	push	dword 3
	push	dword buffer
	push	dword stdout
	sys.write
	add	esp, byte 12
	jmp	short _start

.done:
	push	dword 0
	sys.exit
....

In the data section we create an array called `hex`. It contains the 16 hexadecimal digits in ascending order. The array is followed by a buffer which we will use for both input and output. The first two bytes of the buffer are initially set to `0`. This is where we will write the two hexadecimal digits (the first byte also is where we will read the input). The third byte is a space.

The code section consists of four parts: Reading the byte, converting it to a hexadecimal number, writing the result, and eventually exiting the program.

To read the byte, we ask the system to read one byte from [.filename]#stdin#, and store it in the first byte of the `buffer`. The system returns the number of bytes read in `EAX`. This will be `1` while data is coming, or `0` when no more input data is available. Therefore, we check the value of `EAX`. If it is `0`, we jump to `.done`, otherwise we continue.
[NOTE] ==== For simplicity sake, we are ignoring the possibility of an error condition at this time. ==== The hexadecimal conversion reads the byte from the `buffer` into `EAX`, or actually just `AL`, while clearing the remaining bits of `EAX` to zeros. We also copy the byte to `EDX` because we need to convert the upper four bits (nibble) separately from the lower four bits. We store the result in the first two bytes of the buffer. Next, we ask the system to write the three bytes of the buffer, i.e., the two hexadecimal digits and the blank space, to [.filename]#stdout#. We then jump back to the beginning of the program and process the next byte. Once there is no more input left, we ask the system to exit our program, returning a zero, which is the traditional value meaning the program was successful. Go ahead, and save the code in a file named [.filename]#hex.asm#, then type the following (the `^D` means press the control key and type `D` while holding the control key down): [source,shell] .... % nasm -f elf hex.asm % ld -s -o hex hex.o % ./hex Hello, World! 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A Here I come! 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D % .... [NOTE] ==== If you are migrating to UNIX(R) from MS-DOS(R), you may be wondering why each line ends with `0A` instead of `0D 0A`. This is because UNIX(R) does not use the cr/lf convention, but a "new line" convention, which is `0A` in hexadecimal. ==== Can we improve this? Well, for one, it is a bit confusing because once we have converted a line of text, our input no longer starts at the beginning of the line. We can modify it to print a new line instead of a space after each `0A`: [.programlisting] .... 
%include 'system.inc' section .data hex db '0123456789ABCDEF' buffer db 0, 0, ' ' section .text global _start _start: mov cl, ' ' .loop: ; read a byte from stdin push dword 1 push dword buffer push dword stdin sys.read add esp, byte 12 or eax, eax je .done ; convert it to hex movzx eax, byte [buffer] mov [buffer+2], cl cmp al, 0Ah jne .hex mov [buffer+2], al .hex: mov edx, eax shr dl, 4 mov dl, [hex+edx] mov [buffer], dl and al, 0Fh mov al, [hex+eax] mov [buffer+1], al ; print it push dword 3 push dword buffer push dword stdout sys.write add esp, byte 12 jmp short .loop .done: push dword 0 sys.exit .... We have stored the space in the `CL` register. We can do this safely because, unlike Microsoft(R) Windows(R), UNIX(R) system calls do not modify the value of any register they do not use to return a value in. That means we only need to set `CL` once. We have, therefore, added a new label `.loop` and jump to it for the next byte instead of jumping at `_start`. We have also added the `.hex` label so we can either have a blank space or a new line as the third byte of the `buffer`. Once you have changed [.filename]#hex.asm# to reflect these changes, type: [source,shell] .... % nasm -f elf hex.asm % ld -s -o hex hex.o % ./hex Hello, World! 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A Here I come! 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D % .... That looks better. But this code is quite inefficient! We are making a system call for every single byte twice (once to read it, another time to write the output). [[x86-buffered-io]] == Buffered Input and Output We can improve the efficiency of our code by buffering our input and output. We create an input buffer and read a whole sequence of bytes at one time. Then we fetch them one by one from the buffer. We also create an output buffer. We store our output in it until it is full. At that time we ask the kernel to write the contents of the buffer to [.filename]#stdout#. The program ends when there is no more input. 
But we still need to ask the kernel to write the contents of our output buffer to [.filename]#stdout# one last time, otherwise some of our output would make it to the output buffer, but never be sent out. Do not forget that, or you will be wondering why some of your output is missing. [.programlisting] .... %include 'system.inc' %define BUFSIZE 2048 section .data hex db '0123456789ABCDEF' section .bss ibuffer resb BUFSIZE obuffer resb BUFSIZE section .text global _start _start: sub eax, eax sub ebx, ebx sub ecx, ecx mov edi, obuffer .loop: ; read a byte from stdin call getchar ; convert it to hex mov dl, al shr al, 4 mov al, [hex+eax] call putchar mov al, dl and al, 0Fh mov al, [hex+eax] call putchar mov al, ' ' cmp dl, 0Ah jne .put mov al, dl .put: call putchar jmp short .loop align 4 getchar: or ebx, ebx jne .fetch call read .fetch: lodsb dec ebx ret read: push dword BUFSIZE mov esi, ibuffer push esi push dword stdin sys.read add esp, byte 12 mov ebx, eax or eax, eax je .done sub eax, eax ret align 4 .done: call write ; flush output buffer push dword 0 sys.exit align 4 putchar: stosb inc ecx cmp ecx, BUFSIZE je write ret align 4 write: sub edi, ecx ; start of buffer push ecx push edi push dword stdout sys.write add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now ret .... We now have a third section in the source code, named `.bss`. This section is not included in our executable file, and, therefore, cannot be initialized. We use `resb` instead of `db`. It simply reserves the requested size of uninitialized memory for our use. We take advantage of the fact that the system does not modify the registers: We use registers for what, otherwise, would have to be global variables stored in the `.data` section. This is also why the UNIX(R) convention of passing parameters to system calls on the stack is superior to the Microsoft convention of passing them in the registers: We can keep the registers for our own use. 
We use `EDI` and `ESI` as pointers to the next byte to be read from or written to. We use `EBX` and `ECX` to keep count of the number of bytes in the two buffers, so we know when to dump the output to, or read more input from, the system. Let us see how it works now: [source,shell] .... % nasm -f elf hex.asm % ld -s -o hex hex.o % ./hex Hello, World! Here I come! 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D % .... Not what you expected? The program did not print the output until we pressed `^D`. That is easy to fix by inserting three lines of code to write the output every time we have converted a new line to `0A`. I have marked the three lines with > (do not copy the > in your [.filename]#hex.asm#). [.programlisting] .... %include 'system.inc' %define BUFSIZE 2048 section .data hex db '0123456789ABCDEF' section .bss ibuffer resb BUFSIZE obuffer resb BUFSIZE section .text global _start _start: sub eax, eax sub ebx, ebx sub ecx, ecx mov edi, obuffer .loop: ; read a byte from stdin call getchar ; convert it to hex mov dl, al shr al, 4 mov al, [hex+eax] call putchar mov al, dl and al, 0Fh mov al, [hex+eax] call putchar mov al, ' ' cmp dl, 0Ah jne .put mov al, dl .put: call putchar > cmp al, 0Ah > jne .loop > call write jmp short .loop align 4 getchar: or ebx, ebx jne .fetch call read .fetch: lodsb dec ebx ret read: push dword BUFSIZE mov esi, ibuffer push esi push dword stdin sys.read add esp, byte 12 mov ebx, eax or eax, eax je .done sub eax, eax ret align 4 .done: call write ; flush output buffer push dword 0 sys.exit align 4 putchar: stosb inc ecx cmp ecx, BUFSIZE je write ret align 4 write: sub edi, ecx ; start of buffer push ecx push edi push dword stdout sys.write add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now ret .... Now, let us see how it works: [source,shell] .... % nasm -f elf hex.asm % ld -s -o hex hex.o % ./hex Hello, World! 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A Here I come! 
48 65 72 65 20 49 20 63 6F 6D 65 21 0A
^D %
....

Not bad for a 644-byte executable, is it!

[NOTE]
====
This approach to buffered input/output still contains a hidden danger. I will discuss it, and fix it, later, when I talk about the crossref:x86[x86-buffered-dark-side,dark side of buffering].
====

[[x86-ungetc]]
=== How to Unread a Character

[WARNING]
====
This may be a somewhat advanced topic, mostly of interest to programmers familiar with the theory of compilers. If you wish, you may crossref:x86[x86-command-line,skip to the next section], and perhaps read this later.
====

While our sample program does not require it, more sophisticated filters often need to look ahead. In other words, they may need to see what the next character is (or even several characters). If the next character is of a certain value, it is part of the token currently being processed. Otherwise, it is not.

For example, you may be parsing the input stream for a textual string (e.g., when implementing a language compiler): If a character is followed by another character, or perhaps a digit, it is part of the token you are processing. If it is followed by white space, or some other value, then it is not part of the current token.

This presents an interesting problem: How to return the next character back to the input stream, so it can be read again later?

One possible solution is to store it in a character variable, then set a flag. We can modify `getchar` to check the flag, and if it is set, fetch the byte from that variable instead of the input buffer, and reset the flag. But, of course, that slows us down.

The C language has an `ungetc()` function, just for that purpose. Is there a quick way to implement it in our code? I would like you to scroll back up and take a look at the `getchar` procedure and see if you can find a nice and fast solution before reading the next paragraph. Then come back here and see my own solution.
The key to returning a character back to the stream is in how we are getting the characters to start with:

First we check if the buffer is empty by testing the value of `EBX`. If it is zero, we call the `read` procedure.

If we do have a character available, we use `lodsb`, then decrease the value of `EBX`. The `lodsb` instruction is effectively identical to:

[.programlisting]
....
	mov	al, [esi]
	inc	esi
....

The byte we have fetched remains in the buffer until the next time `read` is called. We do not know when that happens, but we do know it will not happen until the next call to `getchar`. Hence, to "return" the last-read byte back to the stream, all we have to do is decrease the value of `ESI` and increase the value of `EBX`:

[.programlisting]
....
ungetc:
	dec	esi
	inc	ebx
	ret
....

But, be careful! We are perfectly safe doing this if our look-ahead is at most one character at a time. If we are examining more than one upcoming character and call `ungetc` several times in a row, it will work most of the time, but not all the time (and will be tough to debug). Why?

Because as long as `getchar` does not have to call `read`, all of the pre-read bytes are still in the buffer, and our `ungetc` works without a glitch. But the moment `getchar` calls `read`, the contents of the buffer change.

We can always rely on `ungetc` working properly on the last character we have read with `getchar`, but not on anything we have read before that.

If your program reads more than one byte ahead, you have at least two choices:

If possible, modify the program so it only reads one byte ahead. This is the simplest solution.

If that option is not available, first of all determine the maximum number of characters your program needs to return to the input stream at one time. Increase that number slightly, just to be sure, preferably to a multiple of 16, so it aligns nicely.
Then modify the `.bss` section of your code, and create a small "spare" buffer right before your input buffer, something like this:

[.programlisting]
....
section	.bss
	resb	16	; or whatever the value you came up with
ibuffer	resb	BUFSIZE
obuffer	resb	BUFSIZE
....

You also need to modify your `ungetc` to pass the value of the byte to unget in `AL`:

[.programlisting]
....
ungetc:
	dec	esi
	inc	ebx
	mov	[esi], al
	ret
....

With this modification, you can call `ungetc` up to 17 times in a row safely (the first call will still be within the buffer, the remaining 16 may be either within the buffer or within the "spare").

[[x86-command-line]]
== Command Line Arguments

Our hex program will be more useful if it can read the names of an input and output file from its command line, i.e., if it can process the command line arguments. But... Where are they?

Before a UNIX(R) system starts a program, it ``push``es some data on the stack, then jumps to the `_start` label of the program. Yes, I said jumps, not calls. That means the data can be accessed by reading `[esp+offset]`, or by simply ``pop``ping it.

The value at the top of the stack contains the number of command line arguments. It is traditionally called `argc`, for "argument count."

Command line arguments follow next, all `argc` of them. These are typically referred to as `argv`, for "argument value(s)." That is, we get `argv[0]`, `argv[1]`, `...`, `argv[argc-1]`. These are not the actual arguments, but pointers to arguments, i.e., memory addresses of the actual arguments. The arguments themselves are NUL-terminated character strings.

The `argv` list is followed by a NULL pointer, which is simply a `0`. There is more, but this is enough for our purposes right now.

[NOTE]
====
If you have come from the MS-DOS(R) programming environment, the main difference is that each argument is in a separate string. The second difference is that there is no practical limit on how many arguments there can be.
====

Armed with this knowledge, we are almost ready for the next version of [.filename]#hex.asm#. First, however, we need to add a few lines to [.filename]#system.inc#:

First, we need to add two new entries to our list of system call numbers:

[.programlisting]
....
%define	SYS_open	5
%define	SYS_close	6
....

Then we add two new macros at the end of the file:

[.programlisting]
....
%macro	sys.open	0
	system	SYS_open
%endmacro

%macro	sys.close	0
	system	SYS_close
%endmacro
....

Here, then, is our modified source code:

[.programlisting]
....
%include	'system.inc'

%define	BUFSIZE	2048

section	.data
fd.in	dd	stdin
fd.out	dd	stdout
hex	db	'0123456789ABCDEF'

section	.bss
ibuffer	resb	BUFSIZE
obuffer	resb	BUFSIZE

section	.text
align 4
err:
	push	dword 1		; return failure
	sys.exit

align 4
global	_start
_start:
	add	esp, byte 8	; discard argc and argv[0]

	pop	ecx
	jecxz	.init		; no more arguments

	; ECX contains the path to input file
	push	dword 0		; O_RDONLY
	push	ecx
	sys.open
	jc	err		; open failed

	add	esp, byte 8
	mov	[fd.in], eax

	pop	ecx
	jecxz	.init		; no more arguments

	; ECX contains the path to output file
	push	dword 420	; file mode (644 octal)
	push	dword 0200h | 0400h | 01h	; O_CREAT | O_TRUNC | O_WRONLY
	push	ecx
	sys.open
	jc	err

	add	esp, byte 12
	mov	[fd.out], eax

.init:
	sub	eax, eax
	sub	ebx, ebx
	sub	ecx, ecx
	mov	edi, obuffer

.loop:
	; read a byte from input file or stdin
	call	getchar

	; convert it to hex
	mov	dl, al
	shr	al, 4
	mov	al, [hex+eax]
	call	putchar

	mov	al, dl
	and	al, 0Fh
	mov	al, [hex+eax]
	call	putchar

	mov	al, ' '
	cmp	dl, 0Ah
	jne	.put
	mov	al, dl

.put:
	call	putchar
	cmp	al, dl
	jne	.loop
	call	write
	jmp	short .loop

align 4
getchar:
	or	ebx, ebx
	jne	.fetch

	call	read

.fetch:
	lodsb
	dec	ebx
	ret

read:
	push	dword BUFSIZE
	mov	esi, ibuffer
	push	esi
	push	dword [fd.in]
	sys.read
	add	esp, byte 12

	mov	ebx, eax
	or	eax, eax
	je	.done

	sub	eax, eax
	ret

align 4
.done:
	call	write		; flush output buffer

	; close files
	push	dword [fd.in]
	sys.close

	push	dword [fd.out]
	sys.close

	; return success
	push	dword 0
	sys.exit

align 4
putchar:
	stosb
	inc	ecx
	cmp	ecx, BUFSIZE
	je	write
	ret

align 4
write:
	sub	edi, ecx	; start of buffer
	push	ecx
	push	edi
	push	dword [fd.out]
	sys.write
	add	esp, byte 12

	sub	eax, eax
	sub	ecx, ecx	; buffer is empty now
	ret
....

In our `.data` section we now have two new variables, `fd.in` and `fd.out`. We store the input and output file descriptors here.

In the `.text` section we have replaced the references to `stdin` and `stdout` with `[fd.in]` and `[fd.out]`.

The `.text` section now starts with a simple error handler, which does nothing but exit the program with a return value of `1`. The error handler is before `_start` so we are within a short distance from where the errors occur.

Naturally, the program execution still begins at `_start`. First, we remove `argc` and `argv[0]` from the stack: They are of no interest to us (in this program, that is).

We pop `argv[1]` to `ECX`. This register is particularly suited for pointers, as we can handle NULL pointers with `jecxz`. If `argv[1]` is not NULL, we try to open the file named in the first argument. Otherwise, we continue the program as before: Reading from `stdin`, writing to `stdout`. If we fail to open the input file (e.g., it does not exist), we jump to the error handler and quit.

If all went well, we now check for the second argument. If it is there, we open the output file. Otherwise, we send the output to `stdout`. If we fail to open the output file (e.g., it exists and we do not have write permission), we, again, jump to the error handler.

The rest of the code is the same as before, except we close the input and output files before exiting, and, as mentioned, we use `[fd.in]` and `[fd.out]`.

Our executable is now a whopping 768 bytes long. Can we still improve it? Of course! Every program can be improved. Here are a few ideas of what we could do:

* Have our error handler print a message to `stderr`.
* Add error handlers to the `read` and `write` functions.
* Close `stdin` when we open an input file, `stdout` when we open an output file.
* Add command line switches, such as `-i` and `-o`, so we can list the input and output files in any order, or perhaps read from `stdin` and write to a file.
* Print a usage message if command line arguments are incorrect.

I shall leave these enhancements as an exercise for the reader: You already know everything you need to know to implement them.

[[x86-environment]]
== UNIX(R) Environment

An important UNIX(R) concept is the environment, which is defined by _environment variables_. Some are set by the system, others by you, yet others by the shell, or any program that loads another program.

[[x86-find-environment]]
=== How to Find Environment Variables

I said earlier that when a program starts executing, the stack contains `argc` followed by the NULL-terminated `argv` array, followed by something else. The "something else" is the _environment_, or, to be more precise, a NULL-terminated array of pointers to _environment variables_. This is often referred to as `env`.

The structure of `env` is the same as that of `argv`, a list of memory addresses followed by a NULL (`0`). In this case, there is no `"envc"`; we figure out where the array ends by searching for the final NULL.

The variables usually come in the `name=value` format, but sometimes the `=value` part may be missing. We need to account for that possibility.

[[x86-webvar]]
=== webvars

I could just show you some code that prints the environment the same way the UNIX(R) env command does. But I thought it would be more interesting to write a simple assembly language CGI utility.

[[x86-cgi]]
==== CGI: a Quick Overview

I have a http://www.whizkidtech.redprince.net/cgi-bin/tutorial[detailed CGI tutorial] on my web site, but here is a very quick overview of CGI:

* The web server communicates with the CGI program by setting _environment variables_.
* The CGI program sends its output to [.filename]#stdout#.
The web server reads it from there.
* It must start with an HTTP header followed by two blank lines.
* It then prints the HTML code, or whatever other type of data it is producing.

[NOTE]
====
While certain _environment variables_ use standard names, others vary, depending on the web server. That makes webvars quite a useful diagnostic tool.
====

[[x86-webvars-the-code]]
==== The Code

Our webvars program, then, must send out the HTTP header followed by some HTML mark-up. It then must read the _environment variables_ one by one and send them out as part of the HTML page.

The code follows. I placed comments and explanations right inside the code:

[.programlisting]
....
;;;;;;; webvars.asm ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; Copyright (c) 2000 G. Adam Stanislav
; All rights reserved.
;
; Redistribution and use in source and binary forms, with or without
; modification, are permitted provided that the following conditions
; are met:
; 1. Redistributions of source code must retain the above copyright
;    notice, this list of conditions and the following disclaimer.
; 2. Redistributions in binary form must reproduce the above copyright
;    notice, this list of conditions and the following disclaimer in the
;    documentation and/or other materials provided with the distribution.
;
; THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
; ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
; IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
; ARE DISCLAIMED.
; IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
; FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
; OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
; HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
; OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
; SUCH DAMAGE.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; Version 1.0
;
; Started: 8-Dec-2000
; Updated: 8-Dec-2000
;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
%include	'system.inc'

section	.data
http	db	'Content-type: text/html', 0Ah, 0Ah
	db	'<?xml version="1.0" encoding="utf-8"?>', 0Ah
	db	'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
	db	'"DTD/xhtml1-strict.dtd">', 0Ah
	db	'<html xmlns="http://www.w3.org/1999/xhtml" '
	db	'xml.lang="en" lang="en">', 0Ah
	db	'<head>', 0Ah
	db	'<title>Web Environment</title>', 0Ah
	db	'<meta name="author" content="G. Adam Stanislav" />', 0Ah
	db	'</head>', 0Ah, 0Ah
	db	'<body>', 0Ah
	db	'<div class="webvars">', 0Ah
	db	'<h1>Web Environment</h1>', 0Ah
	db	'<p>The following environment variables are defined '
	db	'on this web server:</p>', 0Ah, 0Ah
	db	'<table>', 0Ah
httplen	equ	$-http

left	db	'<tr>', 0Ah
	db	'<td><tt>'
leftlen	equ	$-left

middle	db	'</tt></td>', 0Ah
	db	'<td><tt><b>'
midlen	equ	$-middle

undef	db	'<i>(undefined)</i>'
undeflen	equ	$-undef

right	db	'</b></tt></td>', 0Ah
	db	'</tr>', 0Ah
rightlen	equ	$-right

wrap	db	'</table>', 0Ah
	db	'</div>', 0Ah
	db	'</body>', 0Ah
	db	'</html>', 0Ah, 0Ah
wraplen	equ	$-wrap

section	.text
global	_start
_start:
	; First, send out all the http and xhtml stuff that is
	; needed before we start showing the environment
	push	dword httplen
	push	dword http
	push	dword stdout
	sys.write

	; Now find how far on the stack the environment pointers
	; are. We have 12 bytes we have pushed before "argc"
	mov	eax, [esp+12]

	; We need to remove the following from the stack:
	;
	;	The 12 bytes we pushed for sys.write
	;	The  4 bytes of argc
	;	The EAX*4 bytes of argv
	;	The  4 bytes of the NULL after argv
	;
	; Total:
	;	20 + eax * 4
	;
	; Because stack grows down, we need to ADD that many bytes
	; to ESP.
	lea	esp, [esp+20+eax*4]
	cld	; This should already be the case, but let's be sure.

	; Loop through the environment, printing it out
.loop:
	pop	edi
	or	edi, edi	; Done yet?
	je	near .wrap

	; Print the left part of HTML
	push	dword leftlen
	push	dword left
	push	dword stdout
	sys.write

	; It may be tempting to search for the '=' in the env string next.
	; But it is possible there is no '=', so we search for the
	; terminating NUL first.
	mov	esi, edi	; Save start of string
	sub	ecx, ecx
	not	ecx		; ECX = FFFFFFFF
	sub	eax, eax
	repne	scasb
	not	ecx		; ECX = string length + 1
	mov	ebx, ecx	; Save it in EBX

	; Now is the time to find '='
	mov	edi, esi	; Start of string
	mov	al, '='
	repne	scasb
	not	ecx
	add	ecx, ebx	; Length of name

	push	ecx
	push	esi
	push	dword stdout
	sys.write

	; Print the middle part of HTML table code
	push	dword midlen
	push	dword middle
	push	dword stdout
	sys.write

	; Find the length of the value
	not	ecx
	lea	ebx, [ebx+ecx-1]

	; Print "undefined" if 0
	or	ebx, ebx
	jne	.value

	mov	ebx, undeflen
	mov	edi, undef

.value:
	push	ebx
	push	edi
	push	dword stdout
	sys.write

	; Print the right part of the table row
	push	dword rightlen
	push	dword right
	push	dword stdout
	sys.write

	; Get rid of the 60 bytes we have pushed
	add	esp, byte 60

	; Get the next variable
	jmp	.loop

.wrap:
	; Print the rest of HTML
	push	dword wraplen
	push	dword wrap
	push	dword stdout
	sys.write

	; Return success
	push	dword 0
	sys.exit
....

This code produces a 1,396-byte executable. Most of it is data, i.e., the HTML mark-up we need to send out.

Assemble and link it as usual:

[source,shell]
....
% nasm -f elf webvars.asm
% ld -s -o webvars webvars.o
....

To use it, you need to upload [.filename]#webvars# to your web server. Depending on how your web server is set up, you may have to store it in a special [.filename]#cgi-bin# directory, or perhaps rename it with a [.filename]#.cgi# extension.

Then you need to use your browser to view its output. To see its output on my web server, please go to http://www.int80h.org/webvars/[http://www.int80h.org/webvars/]. If curious about the additional environment variables present in a password protected web directory, go to http://www.int80h.org/private/[http://www.int80h.org/private/], using the name `asm` and password `programmer`.

[[x86-files]]
== Working with Files

We have already done some basic file work: We know how to open and close them, how to read and write them using buffers.
But UNIX(R) offers much more functionality when it comes to files. We will examine some of it in this section, and end up with a nice file conversion utility.

Indeed, let us start at the end, that is, with the file conversion utility. It always makes programming easier when we know from the start what the end product is supposed to do.

One of the first programs I wrote for UNIX(R) was link:ftp://ftp.int80h.org/unix/tuc/[tuc], a text-to-UNIX(R) file converter. It converts a text file from other operating systems to a UNIX(R) text file. In other words, it changes from different kinds of line endings to the newline convention of UNIX(R). It saves the output in a different file. Optionally, it converts a UNIX(R) text file to a DOS text file.

I have used tuc extensively, but always only to convert from some other OS to UNIX(R), never the other way. I have always wished it would just overwrite the file instead of me having to send the output to a different file. Most of the time, I end up using it like this:

[source,shell]
....
% tuc myfile tempfile
% mv tempfile myfile
....

It would be nice to have a ftuc, i.e., _fast tuc_, and use it like this:

[source,shell]
....
% ftuc myfile
....

In this chapter, then, we will write ftuc in assembly language (the original tuc is in C), and study various file-oriented kernel services in the process.

At first sight, such a file conversion is very simple: All you have to do is strip the carriage returns, right?

If you answered yes, think again: That approach will work most of the time (at least with MS-DOS(R) text files), but will fail occasionally.

The problem is that not all non-UNIX(R) text files end their line with the carriage return / line feed sequence. Some use carriage returns without line feeds. Others combine several blank lines into a single carriage return followed by several line feeds. And so on.
A text file converter, then, must be able to handle any possible line endings:

* carriage return / line feed
* carriage return
* line feed / carriage return
* line feed

It should also handle files that use some kind of a combination of the above (e.g., carriage return followed by several line feeds).

[[x86-finite-state-machine]]
=== Finite State Machine

The problem is easily solved by the use of a technique called _finite state machine_, originally developed by the designers of digital electronic circuits. A _finite state machine_ is a digital circuit whose output is dependent not only on its input but on its previous input, i.e., on its state.

The microprocessor is an example of a _finite state machine_: Our assembly language code is assembled to machine language in which some assembly language code produces a single byte of machine language, while others produce several bytes. As the microprocessor fetches the bytes from the memory one by one, some of them simply change its state rather than produce some output. When all the bytes of the op code are fetched, the microprocessor produces some output, or changes the value of a register, etc.

Because of that, all software is essentially a sequence of state instructions for the microprocessor. Nevertheless, the concept of _finite state machine_ is useful in software design as well.

Our text file converter can be designed as a _finite state machine_ with three possible states. We could call them states 0-2, but it will make our life easier if we give them symbolic names:

* ordinary
* cr
* lf

Our program will start in the ordinary state. During this state, the program action depends on its input as follows:

* If the input is anything other than a carriage return or line feed, the input is simply passed on to the output. The state remains unchanged.
* If the input is a carriage return, the state is changed to cr. The input is then discarded, i.e., no output is made.
* If the input is a line feed, the state is changed to lf. The input is then discarded.

Whenever we are in the cr state, it is because the last input was a carriage return, which was unprocessed. What our software does in this state again depends on the current input:

* If the input is anything other than a carriage return or line feed, output a line feed, then output the input, then change the state to ordinary.
* If the input is a carriage return, we have received two (or more) carriage returns in a row. We discard the input, we output a line feed, and leave the state unchanged.
* If the input is a line feed, we output the line feed and change the state to ordinary. Note that this is not the same as the first case above - if we tried to combine them, we would be outputting two line feeds instead of one.

Finally, we are in the lf state after we have received a line feed that was not preceded by a carriage return. This will happen when our file already is in UNIX(R) format, or whenever several lines in a row are expressed by a single carriage return followed by several line feeds, or when a line ends with a line feed / carriage return sequence. Here is how we need to handle our input in this state:

* If the input is anything other than a carriage return or line feed, we output a line feed, then output the input, then change the state to ordinary. This is exactly the same action as in the cr state upon receiving the same kind of input.
* If the input is a carriage return, we discard the input, we output a line feed, then change the state to ordinary.
* If the input is a line feed, we output the line feed, and leave the state unchanged.

[[x86-final-state]]
==== The Final State

The above _finite state machine_ works for the entire file, but leaves the possibility that the final line end will be ignored. That will happen whenever the file ends with a single carriage return or a single line feed.
I did not think of it when I wrote tuc, only to discover that occasionally it strips the last line ending.

This problem is easily fixed by checking the state after the entire file was processed. If the state is not ordinary, we simply need to output one last line feed.

[NOTE]
====
Now that we have expressed our algorithm as a _finite state machine_, we could easily design a dedicated digital electronic circuit (a "chip") to do the conversion for us. Of course, doing so would be considerably more expensive than writing an assembly language program.
====

[[x86-tuc-counter]]
==== The Output Counter

Because our file conversion program may be combining two characters into one, we need to use an output counter. We initialize it to `0`, and increase it every time we send a character to the output. At the end of the program, the counter will tell us what size we need to set the file to.

[[x86-software-fsm]]
=== Implementing FSM in Software

The hardest part of working with a _finite state machine_ is analyzing the problem and expressing it as a _finite state machine_. That accomplished, the software almost writes itself.

In a high-level language, such as C, there are several main approaches. One is to use a `switch` statement which chooses what function should be run. For example,

[.programlisting]
....
	switch (state) {
	default:
	case REGULAR:
		regular(inputchar);
		break;
	case CR:
		cr(inputchar);
		break;
	case LF:
		lf(inputchar);
		break;
	}
....

Another approach is by using an array of function pointers, something like this:

[.programlisting]
....
	(output[state])(inputchar);
....

Yet another is to have `state` be a function pointer, set to point at the appropriate function:

[.programlisting]
....
	(*state)(inputchar);
....

This is the approach we will use in our program because it is very easy to do in assembly language, and very fast, too. We will simply keep the address of the right procedure in `EBX`, and then just issue:

[.programlisting]
....
	call	ebx
....
This is possibly faster than hardcoding the address in the code because the microprocessor does not have to fetch the address from the memory - it is already stored in one of its registers. I said _possibly_ because with the caching modern microprocessors do, either way may be equally fast.

[[memory-mapped-files]]
=== Memory Mapped Files

Because our program works on a single file, we cannot use the approach that worked for us before, i.e., to read from an input file and to write to an output file.

UNIX(R) allows us to map a file, or a section of a file, into memory. To do that, we first need to open the file with the appropriate read/write flags. Then we use the `mmap` system call to map it into the memory. One nice thing about `mmap` is that it automatically works with virtual memory: We can map more of the file into the memory than we have physical memory available, yet still access it through regular memory op codes, such as `mov`, `lods`, and `stos`. Whatever changes we make to the memory image of the file will be written to the file by the system. We do not even have to keep the file open: As long as it stays mapped, we can read from it and write to it.

The 32-bit Intel microprocessors can access up to four gigabytes of memory - physical or virtual. The FreeBSD system allows us to use up to a half of it for file mapping.

For simplicity's sake, in this tutorial we will only convert files that can be mapped into the memory in their entirety. There are probably not too many text files that exceed two gigabytes in size. If our program encounters one, it will simply display a message suggesting we use the original tuc instead.

If you examine your copy of [.filename]#syscalls.master#, you will find two separate syscalls named `mmap`. This is because of evolution of UNIX(R): There was the traditional BSD `mmap`, syscall 71. That one was superseded by the POSIX(R) `mmap`, syscall 197.
The FreeBSD system supports both because older programs were written by using the original BSD version. But new software uses the POSIX(R) version, which is what we will use.

The [.filename]#syscalls.master# lists the POSIX(R) version like this:

[.programlisting]
....
197	STD	BSD	{ caddr_t mmap(caddr_t addr, size_t len, int prot, \
			    int flags, int fd, long pad, off_t pos); }
....

This differs slightly from what man:mmap[2] says. That is because man:mmap[2] describes the C version. The difference is in the `long pad` argument, which is not present in the C version. However, the FreeBSD syscalls add a 32-bit pad after ``push``ing a 64-bit argument. In this case, `off_t` is a 64-bit value.

When we are finished working with a memory-mapped file, we unmap it with the `munmap` syscall:

[TIP]
====
For an in-depth treatment of `mmap`, see W. Richard Stevens' http://www.int80h.org/cgi-bin/isbn?isbn=0130810819[Unix Network Programming, Volume 2, Chapter 12].
====

[[x86-file-size]]
=== Determining File Size

Because we need to tell `mmap` how many bytes of the file to map into the memory, and because we want to map the entire file, we need to determine the size of the file.

We can use the `fstat` syscall to get all the information about an open file that the system can give us. That includes the file size.

Again, [.filename]#syscalls.master# lists two versions of `fstat`, a traditional one (syscall 62), and a POSIX(R) one (syscall 189). Naturally, we will use the POSIX(R) version:

[.programlisting]
....
189	STD	POSIX	{ int fstat(int fd, struct stat *sb); }
....

This is a very straightforward call: We pass to it the address of a `stat` structure and the descriptor of an open file. It will fill out the contents of the `stat` structure.

I do, however, have to say that I tried to declare the `stat` structure in the `.bss` section, and `fstat` did not like it: It set the carry flag indicating an error.
After I changed the code to allocate the structure on the stack, everything was working fine.

[[x86-ftruncate]]
=== Changing the File Size

Because our program may combine carriage return / line feed sequences into straight line feeds, our output may be smaller than our input. However, since we are placing our output into the same file we read the input from, we may have to change the size of the file.

The `ftruncate` system call allows us to do just that. Despite its somewhat misleading name, the `ftruncate` system call can be used both to truncate the file (make it smaller) and to grow it.

And yes, we will find two versions of `ftruncate` in [.filename]#syscalls.master#, an older one (130), and a newer one (201). We will use the newer one:

[.programlisting]
....
201	STD	BSD	{ int ftruncate(int fd, int pad, off_t length); }
....

Please note that this one contains an `int pad` again.

[[x86-ftuc]]
=== ftuc

We now know everything we need to write ftuc. We start by adding some new lines in [.filename]#system.inc#. First, we define some constants and structures, somewhere at or near the beginning of the file:

[.programlisting]
....
;;;;;;; open flags
%define	O_RDONLY	0
%define	O_WRONLY	1
%define	O_RDWR	2

;;;;;;; mmap flags
%define	PROT_NONE	0
%define	PROT_READ	1
%define	PROT_WRITE	2
%define	PROT_EXEC	4
;;
%define	MAP_SHARED	0001h
%define	MAP_PRIVATE	0002h

;;;;;;; stat structure
struc	stat
st_dev		resd	1	; = 0
st_ino		resd	1	; = 4
st_mode		resw	1	; = 8, size is 16 bits
st_nlink	resw	1	; = 10, ditto
st_uid		resd	1	; = 12
st_gid		resd	1	; = 16
st_rdev		resd	1	; = 20
st_atime	resd	1	; = 24
st_atimensec	resd	1	; = 28
st_mtime	resd	1	; = 32
st_mtimensec	resd	1	; = 36
st_ctime	resd	1	; = 40
st_ctimensec	resd	1	; = 44
st_size		resd	2	; = 48, size is 64 bits
st_blocks	resd	2	; = 56, ditto
st_blksize	resd	1	; = 64
st_flags	resd	1	; = 68
st_gen		resd	1	; = 72
st_lspare	resd	1	; = 76
st_qspare	resd	4	; = 80
endstruc
....

We define the new syscalls:

[.programlisting]
....
%define	SYS_mmap	197
%define	SYS_munmap	73
%define	SYS_fstat	189
%define	SYS_ftruncate	201
....

We add the macros for their use:

[.programlisting]
....
%macro	sys.mmap	0
	system	SYS_mmap
%endmacro

%macro	sys.munmap	0
	system	SYS_munmap
%endmacro

%macro	sys.ftruncate	0
	system	SYS_ftruncate
%endmacro

%macro	sys.fstat	0
	system	SYS_fstat
%endmacro
....

And here is our code:

[.programlisting]
....
;;;;;;; Fast Text-to-Unix Conversion (ftuc.asm) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Started:	21-Dec-2000
;; Updated:	22-Dec-2000
;;
;; Copyright 2000 G. Adam Stanislav.
;; All rights reserved.
;;
;;;;;;; v.1 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

%include	'system.inc'

section	.data
	db	'Copyright 2000 G. Adam Stanislav.', 0Ah
	db	'All rights reserved.', 0Ah
usg	db	'Usage: ftuc filename', 0Ah
usglen	equ	$-usg
co	db	"ftuc: Can't open file.", 0Ah
colen	equ	$-co
fae	db	'ftuc: File access error.', 0Ah
faelen	equ	$-fae
ftl	db	'ftuc: File too long, use regular tuc instead.', 0Ah
ftllen	equ	$-ftl
mae	db	'ftuc: Memory allocation error.', 0Ah
maelen	equ	$-mae

section	.text

align 4
memerr:
	push	dword maelen
	push	dword mae
	jmp	short error

align 4
toolong:
	push	dword ftllen
	push	dword ftl
	jmp	short error

align 4
facerr:
	push	dword faelen
	push	dword fae
	jmp	short error

align 4
cantopen:
	push	dword colen
	push	dword co
	jmp	short error

align 4
usage:
	push	dword usglen
	push	dword usg

error:
	push	dword stderr
	sys.write

	push	dword 1
	sys.exit

align 4
global	_start
_start:
	pop	eax		; argc
	pop	eax		; program name
	pop	ecx		; file to convert
	jecxz	usage

	pop	eax
	or	eax, eax	; Too many arguments?
	jne	usage

	; Open the file
	push	dword O_RDWR
	push	ecx
	sys.open
	jc	cantopen

	mov	ebp, eax	; Save fd

	sub	esp, byte stat_size
	mov	ebx, esp

	; Find file size
	push	ebx
	push	ebp		; fd
	sys.fstat
	jc	facerr

	mov	edx, [ebx + st_size + 4]

	; File is too long if EDX != 0 ...
	or	edx, edx
	jne	near toolong
	mov	ecx, [ebx + st_size]
	; ... or if it is above 2 GB
	or	ecx, ecx
	js	near toolong

	; Do nothing if the file is 0 bytes in size
	jecxz	.quit

	; Map the entire file in memory
	push	edx
	push	edx		; starting at offset 0
	push	edx		; pad
	push	ebp		; fd
	push	dword MAP_SHARED
	push	dword PROT_READ | PROT_WRITE
	push	ecx		; entire file size
	push	edx		; let system decide on the address
	sys.mmap
	jc	near memerr

	mov	edi, eax
	mov	esi, eax
	push	ecx		; for SYS_munmap
	push	edi

	; Use EBX for state machine
	mov	ebx, ordinary
	mov	ah, 0Ah
	cld

.loop:
	lodsb
	call	ebx
	loop	.loop

	cmp	ebx, ordinary
	je	.filesize

	; Output final lf
	mov	al, ah
	stosb
	inc	edx

.filesize:
	; truncate file to new size
	push	dword 0		; high dword
	push	edx		; low dword
	push	eax		; pad
	push	ebp
	sys.ftruncate

	; close it (ebp still pushed)
	sys.close

	add	esp, byte 16
	sys.munmap

.quit:
	push	dword 0
	sys.exit

align 4
ordinary:
	cmp	al, 0Dh
	je	.cr

	cmp	al, ah
	je	.lf

	stosb
	inc	edx
	ret

align 4
.cr:
	mov	ebx, cr
	ret

align 4
.lf:
	mov	ebx, lf
	ret

align 4
cr:
	cmp	al, 0Dh
	je	.cr

	cmp	al, ah
	je	.lf

	xchg	al, ah
	stosb
	inc	edx

	xchg	al, ah
	; fall through

.lf:
	stosb
	inc	edx
	mov	ebx, ordinary
	ret

align 4
.cr:
	mov	al, ah
	stosb
	inc	edx
	ret

align 4
lf:
	cmp	al, ah
	je	.lf

	cmp	al, 0Dh
	je	.cr

	xchg	al, ah
	stosb
	inc	edx
	xchg	al, ah
	stosb
	inc	edx
	mov	ebx, ordinary
	ret

align 4
.cr:
	mov	ebx, ordinary
	mov	al, ah
	; fall through

.lf:
	stosb
	inc	edx
	ret
....

[WARNING]
====
Do not use this program on files stored on a disk formatted by MS-DOS(R) or Windows(R). There seems to be a subtle bug in the FreeBSD code when using `mmap` on these drives mounted under FreeBSD: If the file is over a certain size, `mmap` will just fill the memory with zeros, and then copy them to the file overwriting its contents.
====

[[x86-one-pointed-mind]]
== One-Pointed Mind

As a student of Zen, I like the idea of a one-pointed mind: Do one thing at a time, and do it well.

This, indeed, is very much how UNIX(R) works as well.
While a typical Windows(R) application is attempting to do everything imaginable (and is, therefore, riddled with bugs), a typical UNIX(R) program does only one thing, and it does it well.

The typical UNIX(R) user then essentially assembles his own applications by writing a shell script which combines the various existing programs by piping the output of one program to the input of another.

When writing your own UNIX(R) software, it is generally a good idea to see what parts of the problem you need to solve can be handled by existing programs, and only write your own programs for that part of the problem that you do not have an existing solution for.

[[x86-csv]]
=== CSV

I will illustrate this principle with a specific real-life example I was faced with recently:

I needed to extract the 11th field of each record from a database I downloaded from a web site. The database was a CSV file, i.e., a list of _comma-separated values_. That is quite a standard format for sharing data among people who may be using different database software.

The first line of the file contains the list of various fields separated by commas. The rest of the file contains the data listed line by line, with values separated by commas.

I tried awk, using the comma as a separator. But because several lines contained a quoted comma, awk was extracting the wrong field from those lines.

Therefore, I needed to write my own software to extract the 11th field from the CSV file. However, going with the UNIX(R) spirit, I only needed to write a simple filter that would do the following:

* Remove the first line from the file;
* Change all unquoted commas to a different character;
* Remove all quotation marks.

Strictly speaking, I could use sed to remove the first line from the file, but doing so in my own program was very easy, so I decided to do it and reduce the size of the pipeline.

At any rate, writing a program like this took me about 20 minutes.
Writing a program that extracts the 11th field from the CSV file would take a lot longer, and I could not reuse it to extract some other field from some other database.

This time I decided to let it do a little more work than a typical tutorial program would:

* It parses its command line for options;
* It displays proper usage if it finds wrong arguments;
* It produces meaningful error messages.

Here is its usage message:

[source,shell]
....
Usage: csv [-t<delim>] [-c<comma>] [-p] [-o <outfile>] [-i <infile>]
....

All parameters are optional, and can appear in any order.

The `-t` parameter declares what to replace the commas with.
The `tab` is the default here.
For example, `-t;` will replace all unquoted commas with semicolons.

I did not need the `-c` option, but it may come in handy in the future.
It lets me declare that I want a character other than a comma replaced with something else.
For example, `-c@` will replace all at signs (useful if you want to split a list of email addresses to their user names and domains).

The `-p` option preserves the first line, i.e., it does not delete it.
By default, we delete the first line because in a CSV file it contains the field names rather than data.

The `-i` and `-o` options let me specify the input and the output files.
Defaults are [.filename]#stdin# and [.filename]#stdout#, so this is a regular UNIX(R) filter.

I made sure that both `-i filename` and `-ifilename` are accepted.
I also made sure that only one input and one output file may be specified.

To get the 11th field of each record, I can now do:

[source,shell]
....
% csv '-t;' data.csv | awk '-F;' '{print $11}'
....

The code stores the options (except for the file descriptors) in `EDX`: The comma in `DH`, the new separator in `DL`, and the flag for the `-p` option in the highest bit of `EDX`, so a check for its sign will give us a quick decision what to do.

Here is the code:

[.programlisting]
....
;;;;;;; csv.asm ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; Convert a comma-separated file to a something-else separated file.
;
; Started:  31-May-2001
; Updated:   1-Jun-2001
;
; Copyright (c) 2001 G. Adam Stanislav
; All rights reserved.
;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

%include 'system.inc'

%define BUFSIZE 2048

section .data
fd.in   dd      stdin
fd.out  dd      stdout
usg     db      'Usage: csv [-t<delim>] [-c<comma>] [-p] [-o <outfile>] [-i <infile>]', 0Ah
usglen  equ     $-usg
iemsg   db      "csv: Can't open input file", 0Ah
iemlen  equ     $-iemsg
oemsg   db      "csv: Can't create output file", 0Ah
oemlen  equ     $-oemsg

section .bss
ibuffer resb    BUFSIZE
obuffer resb    BUFSIZE

section .text
align 4
ierr:
        push    dword iemlen
        push    dword iemsg
        push    dword stderr
        sys.write
        push    dword 1         ; return failure
        sys.exit

align 4
oerr:
        push    dword oemlen
        push    dword oemsg
        push    dword stderr
        sys.write
        push    dword 2
        sys.exit

align 4
usage:
        push    dword usglen
        push    dword usg
        push    dword stderr
        sys.write
        push    dword 3
        sys.exit

align 4
global  _start
_start:
        add     esp, byte 8     ; discard argc and argv[0]
        mov     edx, (',' << 8) | 9

.arg:
        pop     ecx
        or      ecx, ecx
        je      near .init      ; no more arguments

        ; ECX contains the pointer to an argument
        cmp     byte [ecx], '-'
        jne     usage

        inc     ecx
        mov     ax, [ecx]

.o:
        cmp     al, 'o'
        jne     .i

        ; Make sure we are not asked for the output file twice
        cmp     dword [fd.out], stdout
        jne     usage

        ; Find the path to output file - it is either at [ECX+1],
        ; i.e., -ofile --
        ; or in the next argument,
        ; i.e., -o file

        inc     ecx
        or      ah, ah
        jne     .openoutput
        pop     ecx
        jecxz   usage

.openoutput:
        push    dword 420       ; file mode (644 octal)
        push    dword 0200h | 0400h | 01h
        ; O_CREAT | O_TRUNC | O_WRONLY
        push    ecx
        sys.open
        jc      near oerr

        add     esp, byte 12
        mov     [fd.out], eax
        jmp     short .arg

.i:
        cmp     al, 'i'
        jne     .p

        ; Make sure we are not asked twice
        cmp     dword [fd.in], stdin
        jne     near usage

        ; Find the path to the input file
        inc     ecx
        or      ah, ah
        jne     .openinput
        pop     ecx
        or      ecx, ecx
        je      near usage

.openinput:
        push    dword 0         ; O_RDONLY
        push    ecx
        sys.open
        jc      near ierr       ; open failed

        add     esp, byte 8
        mov     [fd.in], eax
        jmp     .arg

.p:
        cmp     al, 'p'
        jne     .t
        or      ah, ah
        jne     near usage
        or      edx, 1 << 31
        jmp     .arg

.t:
        cmp     al, 't'         ; redefine output delimiter
        jne     .c
        or      ah, ah
        je      near usage
        mov     dl, ah
        jmp     .arg

.c:
        cmp     al, 'c'
        jne     near usage
        or      ah, ah
        je      near usage
        mov     dh, ah
        jmp     .arg

align 4
.init:
        sub     eax, eax
        sub     ebx, ebx
        sub     ecx, ecx
        mov     edi, obuffer

        ; See if we are to preserve the first line
        or      edx, edx
        js      .loop

.firstline:
        ; get rid of the first line
        call    getchar
        cmp     al, 0Ah
        jne     .firstline

.loop:
        ; read a byte from stdin
        call    getchar

        ; is it a comma (or whatever the user asked for)?
        cmp     al, dh
        jne     .quote

        ; Replace the comma with a tab (or whatever the user wants)
        mov     al, dl

.put:
        call    putchar
        jmp     short .loop

.quote:
        cmp     al, '"'
        jne     .put

        ; Print everything until you get another quote or EOL. If it
        ; is a quote, skip it. If it is EOL, print it.

.qloop:
        call    getchar
        cmp     al, '"'
        je      .loop

        cmp     al, 0Ah
        je      .put

        call    putchar
        jmp     short .qloop

align 4
getchar:
        or      ebx, ebx
        jne     .fetch

        call    read

.fetch:
        lodsb
        dec     ebx
        ret

read:
        jecxz   .read
        call    write

.read:
        push    dword BUFSIZE
        mov     esi, ibuffer
        push    esi
        push    dword [fd.in]
        sys.read
        add     esp, byte 12
        mov     ebx, eax
        or      eax, eax
        je      .done
        sub     eax, eax
        ret

align 4
.done:
        call    write           ; flush output buffer

        ; close files
        push    dword [fd.in]
        sys.close

        push    dword [fd.out]
        sys.close

        ; return success
        push    dword 0
        sys.exit

align 4
putchar:
        stosb
        inc     ecx
        cmp     ecx, BUFSIZE
        je      write
        ret

align 4
write:
        jecxz   .ret            ; nothing to write
        sub     edi, ecx        ; start of buffer
        push    ecx
        push    edi
        push    dword [fd.out]
        sys.write
        add     esp, byte 12
        sub     eax, eax
        sub     ecx, ecx        ; buffer is empty now

.ret:
        ret
....

Much of it is taken from [.filename]#hex.asm# above.
But there is one important difference: I no longer call `write` whenever I am outputting a line feed.
Yet, the code can be used interactively.

I have found a better solution for the interactive problem since I first started writing this chapter.
I wanted to make sure each line is printed out separately only when needed.
After all, there is no need to flush out every line when used non-interactively.

The new solution I use now is to call `write` every time I find the input buffer empty.
That way, when running in the interactive mode, the program reads one line from the user's keyboard, processes it, and sees its input buffer is empty.
It flushes its output and reads the next line.

[[x86-buffered-dark-side]]
==== The Dark Side of Buffering

This change prevents a mysterious lockup in a very specific case.
I refer to it as the _dark side of buffering_, mostly because it presents a danger that is not quite obvious.

It is unlikely to happen with a program like the csv above, so let us consider yet another filter: In this case we expect our input to be raw data representing color values, such as the _red_, _green_, and _blue_ intensities of a pixel.
Our output will be the negative of our input.

Such a filter would be very simple to write.
Most of it would look just like all the other filters we have written so far, so I am only going to show you its inner loop:

[.programlisting]
....
.loop:
        call    getchar
        not     al              ; Create a negative
        call    putchar
        jmp     short .loop
....

Because this filter works with raw data, it is unlikely to be used interactively.

But it could be called by image manipulation software.
And, unless it calls `write` before each call to `read`, chances are it will lock up.

Here is what might happen:

[.procedure]
. The image editor will load our filter using the C function `popen()`.
. It will read the first row of pixels from a bitmap or pixmap.
. It will write the first row of pixels to the _pipe_ leading to the `fd.in` of our filter.
. Our filter will read each pixel from its input, turn it to a negative, and write it to its output buffer.
. Our filter will call `getchar` to fetch the next pixel.
. `getchar` will find an empty input buffer, so it will call `read`.
. `read` will call the `SYS_read` system call.
. The _kernel_ will suspend our filter until the image editor sends more data to the pipe.
. The image editor will read from the other pipe, connected to the `fd.out` of our filter so it can set the first row of the output image _before_ it sends us the second row of the input.
. The _kernel_ suspends the image editor until it receives some output from our filter, so it can pass it on to the image editor.

At this point our filter waits for the image editor to send it more data to process, while the image editor is waiting for our filter to send it the result of the processing of the first row.
But the result sits in our output buffer.

The filter and the image editor will continue waiting for each other forever (or, at least, until they are killed).
Our software has just entered a crossref:secure[secure-race-conditions,race condition].

This problem does not exist if our filter flushes its output buffer _before_ asking the _kernel_ for more input data.

[[x86-fpu]]
== Using the FPU

Strangely enough, most of assembly language literature does not even mention the existence of the FPU, or _floating point unit_, let alone discuss programming it.

Yet, never does assembly language shine more than when we create highly optimized FPU code by doing things that can be done _only_ in assembly language.

[[x86-fpu-organization]]
=== Organization of the FPU

The FPU consists of 8 80-bit floating-point registers.
These are organized in a stack fashion-you can `push` a value on TOS (_top of stack_) and you can `pop` it.

That said, the assembly language op codes are not `push` and `pop` because those are already taken.

You can `push` a value on TOS by using `fld`, `fild`, and `fbld`.
Several other op codes let you `push` many common _constants_-such as _pi_-on the TOS.

Similarly, you can `pop` a value by using `fst`, `fstp`, `fist`, `fistp`, and `fbstp`.
Actually, only the op codes that end with a _p_ will literally `pop` the value, the rest will `store` it somewhere else without removing it from the TOS.

We can transfer the data between the TOS and the computer memory either as a 32-bit, 64-bit, or 80-bit _real_, a 16-bit, 32-bit, or 64-bit _integer_, or an 80-bit _packed decimal_.

The 80-bit _packed decimal_ is a special case of _binary coded decimal_ which is very convenient when converting between the ASCII representation of data and the internal data of the FPU.
It allows us to use 18 significant digits.

No matter how we represent data in the memory, the FPU always stores it in the 80-bit _real_ format in its registers.
Its internal precision is at least 19 decimal digits, so even if we choose to display results as ASCII in the full 18-digit precision, we are still showing correct results.

We can perform mathematical operations on the TOS: We can calculate its _sine_, we can _scale_ it (i.e., we can multiply or divide it by a power of 2), we can calculate its base-2 _logarithm_, and many other things.
We can also _multiply_ or _divide_ it by, _add_ it to, or _subtract_ it from, any of the FPU registers (including itself).

The official Intel op code for the TOS is `st`, and for the _registers_ `st(0)`-`st(7)`.
`st` and `st(0)`, then, refer to the same register.

For whatever reasons, the original author of nasm has decided to use different op codes, namely `st0`-`st7`.
In other words, there are no parentheses, and the TOS is always `st0`, never just `st`.

[[x86-fpu-packed-decimal]]
==== The Packed Decimal Format

The _packed decimal_ format uses 10 bytes (80 bits) of memory to represent 18 digits.
The number represented there is always an _integer_.

[TIP]
====
You can use it to get decimal places by multiplying the TOS by a power of 10 first.
====

The highest bit of the highest byte (byte 9) is the _sign bit_: If it is set, the number is _negative_, otherwise, it is _positive_.
The rest of the bits of this byte are unused/ignored.

The remaining 9 bytes store the 18 digits of the number: 2 digits per byte.

The _more significant digit_ is stored in the high _nibble_ (4 bits), the _less significant digit_ in the low _nibble_.

That said, you might think that `-1234567` would be stored in the memory like this (using hexadecimal notation):

[.programlisting]
....
80 00 00 00 00 00 01 23 45 67
....

Alas, it is not!
As with everything else of Intel make, even the _packed decimal_ is _little-endian_.

That means our `-1234567` is stored like this:

[.programlisting]
....
67 45 23 01 00 00 00 00 00 80
....

Remember that, or you will be pulling your hair out in desperation!

[NOTE]
====
The book to read-if you can find it-is Richard Startz' http://www.amazon.com/exec/obidos/ASIN/013246604X/whizkidtechnomag[8087/80287/80387 for the IBM PC & Compatibles].
Though it does seem to take the fact about the little-endian storage of the _packed decimal_ for granted.
I kid you not about the desperation of trying to figure out what was wrong with the filter I show below _before_ it occurred to me I should try the little-endian order even for this type of data.
====

[[x86-pinhole-photography]]
=== Excursion to Pinhole Photography

To write meaningful software, we must not only understand our programming tools, but also the field we are creating software for.

Our next filter will help us whenever we want to build a _pinhole camera_, so, we need some background in _pinhole photography_ before we can continue.

[[x86-camera]]
==== The Camera

The easiest way to describe any camera ever built is as some empty space enclosed in some lightproof material, with a small hole in the enclosure.

The enclosure is usually sturdy (e.g., a box), though sometimes it is flexible (the bellows).
It is quite dark inside the camera.
However, the hole lets light rays in through a single point (though in some cases there may be several).
These light rays form an image, a representation of whatever is outside the camera, in front of the hole.

If some light sensitive material (such as film) is placed inside the camera, it can capture the image.

The hole often contains a _lens_, or a lens assembly, often called the _objective_.

[[x86-the-pinhole]]
==== The Pinhole

But, strictly speaking, the lens is not necessary: The original cameras did not use a lens but a _pinhole_.
Even today, _pinholes_ are used, both as a tool to study how cameras work, and to achieve a special kind of image.

The image produced by the _pinhole_ is all equally sharp.
Or _blurred_.
There is an ideal size for a pinhole: If it is either larger or smaller, the image loses its sharpness.

[[x86-focal-length]]
==== Focal Length

This ideal pinhole diameter is a function of the square root of _focal length_, which is the distance of the pinhole from the film.

[.programlisting]
....
D = PC * sqrt(FL)
....

In here, `D` is the ideal diameter of the pinhole, `FL` is the focal length, and `PC` is a pinhole constant.
According to Jay Bender, its value is `0.04`, while Kenneth Connors has determined it to be `0.037`.
Others have proposed other values.
Plus, this value is for the daylight only: Other types of light will require a different constant, whose value can only be determined by experimentation.

[[x86-f-number]]
==== The F-Number

The f-number is a very useful measure of how much light reaches the film.
A light meter can determine that, for example, to expose a film of specific sensitivity with f/5.6 may require the exposure to last 1/1000 sec.

It does not matter whether it is a 35-mm camera, or a 6x9cm camera, etc.
As long as we know the f-number, we can determine the proper exposure.

The f-number is easy to calculate:

[.programlisting]
....
F = FL / D
....
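Before moving on, the two formulas above can be tried out together.
Here is a minimal Python sketch of mine, purely for illustration-it is not part of the assembly program we are designing, and the function names and default constant are my own choices:

[source,python]
....
import math

def pinhole_diameter(focal_length_mm, pc=0.04):
    # D = PC * sqrt(FL); pc defaults to Bender's constant,
    # pass pc=0.037 for Connors' constant
    return pc * math.sqrt(focal_length_mm)

def f_number(focal_length_mm, diameter_mm):
    # F = FL / D
    return focal_length_mm / diameter_mm

d = pinhole_diameter(150)   # a focal length of 150 mm
f = f_number(150, d)
....

For a focal length of 150 mm this yields a pinhole just under half a millimeter across and an f-number of roughly 306.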
In other words, the f-number equals the focal length divided by the diameter of the pinhole.
It also means a higher f-number either implies a smaller pinhole or a larger focal distance, or both.
That, in turn, implies that the higher the f-number, the longer the exposure has to be.

Furthermore, while the pinhole diameter and the focal distance are one-dimensional measurements, both the film and the pinhole are two-dimensional.
That means that if you have measured the exposure at f-number `A` as `t`, then the exposure at f-number `B` is:

[.programlisting]
....
t * (B / A)²
....

[[x86-normalized-f-number]]
==== Normalized F-Number

While many modern cameras can change the diameter of their pinhole, and thus their f-number, quite smoothly and gradually, such was not always the case.

To allow for different f-numbers, cameras typically contained a metal plate with several holes of different sizes drilled into it.

Their sizes were chosen according to the above formula in such a way that the resultant f-number was one of standard f-numbers used on all cameras everywhere.
For example, a very old Kodak Duaflex IV camera in my possession has three such holes for f-numbers 8, 11, and 16.

A more recently made camera may offer f-numbers of 2.8, 4, 5.6, 8, 11, 16, 22, and 32 (as well as others).
These numbers were not chosen arbitrarily: They all are powers of the square root of 2, though they may be rounded somewhat.

[[x86-f-stop]]
==== The F-Stop

A typical camera is designed in such a way that setting any of the normalized f-numbers changes the feel of the dial.
It will naturally _stop_ in that position.
Because of that, these positions of the dial are called f-stops.

Since the f-numbers at each stop are powers of the square root of 2, moving the dial by 1 stop will double the amount of light required for proper exposure.
Moving it by 2 stops will quadruple the required exposure.
Moving the dial by 3 stops will require the increase in exposure 8 times, etc.
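The exposure rule above is easy to check numerically.
A small Python sketch of mine, again for illustration only:

[source,python]
....
def exposure(t, a, b):
    # exposure at f-number b, given a measured exposure t at f-number a:
    # t * (B / A) squared
    return t * (b / a) ** 2

# moving from f4 to f8 is two stops, so the exposure quadruples:
# exposure(1 / 1000, 4, 8) gives 4/1000
quadrupled = exposure(1 / 1000, 4, 8)
....

Note that with the rounded normalized numbers the ratios are only approximate: `(8 / 5.6)²` comes out at about `2.04` rather than exactly `2`, because the marked `5.6` really stands for the square root of 2 raised to the fifth power.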
[[x86-pinhole-software]]
=== Designing the Pinhole Software

We are now ready to decide what exactly we want our pinhole software to do.

[[xpinhole-processing-input]]
==== Processing Program Input

Since its main purpose is to help us design a working pinhole camera, we will use the _focal length_ as the input to the program.
This is something we can determine without software: Proper focal length is determined by the size of the film and by the need to shoot "regular" pictures, wide angle pictures, or telephoto pictures.

Most of the programs we have written so far worked with individual characters, or bytes, as their input: The hex program converted individual bytes into a hexadecimal number, the csv program either let a character through, or deleted it, or changed it to a different character, etc.
One program, ftuc, used a state machine to consider at most two input bytes at a time.

But our pinhole program cannot just work with individual characters, it has to deal with larger syntactic units.

For example, if we want the program to calculate the pinhole diameter (and other values we will discuss later) at the focal lengths of `100 mm`, `150 mm`, and `210 mm`, we may want to enter something like this:

[source,shell]
....
100, 150, 210
....

Our program needs to consider more than a single byte of input at a time.
When it sees the first `1`, it must understand it is seeing the first digit of a decimal number.
When it sees the `0` and the other `0`, it must know it is seeing more digits of the same number.
When it encounters the first comma, it must know it is no longer receiving the digits of the first number.
It must be able to convert the digits of the first number into the value of `100`.
And the digits of the second number into the value of `150`.
And, of course, the digits of the third number into the numeric value of `210`.

We need to decide what delimiters to accept: Do the input numbers have to be separated by a comma?
If so, how do we treat two numbers separated by something else?

Personally, I like to keep it simple.
Something either is a number, so I process it.
Or it is not a number, so I discard it.
I do not like the computer complaining about me typing in an extra character when it is _obvious_ that it is an extra character.
Duh!

Plus, it allows me to break up the monotony of computing and type in a query instead of just a number:

[source,shell]
....
What is the best pinhole diameter for the focal length of 150?
....

There is no reason for the computer to spit out a number of complaints:

[source,shell]
....
Syntax error: What
Syntax error: is
Syntax error: the
Syntax error: best
....

Et cetera, et cetera, et cetera.

Secondly, I like the `+#+` character to denote the start of a comment which extends to the end of the line.
This does not take too much effort to code, and lets me treat input files for my software as executable scripts.

In our case, we also need to decide what units the input should come in: We choose _millimeters_ because that is how most photographers measure the focal length.

Finally, we need to decide whether to allow the use of the decimal point (in which case we must also consider the fact that much of the world uses a decimal _comma_).

In our case allowing for the decimal point/comma would offer a false sense of precision: There is little if any noticeable difference between the focal lengths of `50` and `51`, so allowing the user to input something like `50.5` is not a good idea.
This is my opinion, mind you, but I am the one writing this program.
You can make other choices in yours, of course.

[[x86-pinhole-options]]
==== Offering Options

The most important thing we need to know when building a pinhole camera is the diameter of the pinhole.
Since we want to shoot sharp images, we will use the above formula to calculate the pinhole diameter from focal length.
As experts are offering several different values for the `PC` constant, we will need to have the choice.

It is traditional in UNIX(R) programming to have two main ways of choosing program parameters, plus to have a default for the time the user does not make a choice.

Why have two ways of choosing?

One is to allow a (relatively) _permanent_ choice that applies automatically each time the software is run without us having to tell it over and over what we want it to do.

The permanent choices may be stored in a configuration file, typically found in the user's home directory.
The file usually has the same name as the application but is started with a dot.
Often _"rc"_ is added to the file name.
So, ours could be [.filename]#~/.pinhole# or [.filename]#~/.pinholerc#.
(The [.filename]#~/# means current user's home directory.)

The configuration file is used mostly by programs that have many configurable parameters.
Those that have only one (or a few) often use a different method: They expect to find the parameter in an _environment variable_.
In our case, we might look at an environment variable named `PINHOLE`.

Usually, a program uses one or the other of the above methods.
Otherwise, if a configuration file said one thing, but an environment variable another, the program might get confused (or just too complicated).

Because we only need to choose _one_ such parameter, we will go with the second method and search the environment for a variable named `PINHOLE`.

The other way allows us to make _ad hoc_ decisions: _"Though I usually want you to use 0.039, this time I want 0.03872."_
In other words, it allows us to _override_ the permanent choice.
This type of choice is usually done with command line parameters.

Finally, a program _always_ needs a _default_.
The user may not make any choices.
Perhaps he does not know what to choose.
Perhaps he is "just browsing."
Preferably, the default will be the value most users would choose anyway.
That way they do not need to choose.
Or, rather, they can choose the default without an additional effort.

Given this system, the program may find conflicting options, and handle them this way:

[.procedure]
. If it finds an _ad hoc_ choice (e.g., command line parameter), it should accept that choice.
It must ignore any permanent choice and any default.
. _Otherwise_, if it finds a permanent option (e.g., an environment variable), it should accept it, and ignore the default.
. _Otherwise_, it should use the default.

We also need to decide what _format_ our `PC` option should have.

At first sight, it seems obvious to use the `PINHOLE=0.04` format for the environment variable, and `-p0.04` for the command line.
Allowing that is actually a security risk.
The `PC` constant is a very small number.
Naturally, we will test our software using various small values of `PC`.
But what will happen if someone runs the program choosing a huge value?

It may crash the program because we have not designed it to handle huge numbers.

Or, we may spend more time on the program so it can handle huge numbers.
We might do that if we were writing commercial software for computer illiterate audience.

Or, we might say, _"Tough! The user should know better."_

Or, we just may make it impossible for the user to enter a huge number.
This is the approach we will take: We will use an _implied 0._ prefix.

In other words, if the user wants `0.04`, we will expect him to type `-p04`, or set `PINHOLE=04` in his environment.
So, if he says `-p9999999`, we will interpret it as ``0.9999999``-still ridiculous but at least safer.

Secondly, many users will just want to go with either Bender's constant or Connors' constant.
To make it easier on them, we will interpret `-b` as identical to `-p04`, and `-c` as identical to `-p037`.

[[x86-pinhole-output]]
==== The Output

We need to decide what we want our software to send to the output, and in what format.
Since our input allows for an unspecified number of focal length entries, it makes sense to use a traditional database-style output of showing the result of the calculation for each focal length on a separate line, while separating all values on one line by a `tab` character.

Optionally, we should also allow the user to specify the use of the CSV format we have studied earlier.
In this case, we will print out a line of comma-separated names describing each field of every line, then show our results as before, but substituting a `comma` for the `tab`.

We need a command line option for the CSV format.
We cannot use `-c` because that already means _use Connors' constant_.
For some strange reason, many web sites refer to CSV files as _"Excel spreadsheet"_ (though the CSV format predates Excel).
We will, therefore, use the `-e` switch to inform our software we want the output in the CSV format.

We will start each line of the output with the focal length.
This may sound repetitious at first, especially in the interactive mode: The user types in the focal length, and we are repeating it.

But the user can type several focal lengths on one line.
The input can also come in from a file or from the output of another program.
In that case the user does not see the input at all.
By the same token, the output can go to a file which we will want to examine later, or it could go to the printer, or become the input of another program.

So, it makes perfect sense to start each line with the focal length as entered by the user.

No, wait!
Not as entered by the user.
What if the user types in something like this:

[source,shell]
....
00000000150
....

Clearly, we need to strip those leading zeros.

So, we might consider reading the user input as is, converting it to binary inside the FPU, and printing it out from there.

But...

What if the user types something like this:

[source,shell]
....
17459765723452353453534535353530530534563507309676764423
....

Ha!
The packed decimal FPU format lets us input 18-digit numbers.
But the user has entered more than 18 digits.
How do we handle that?

Well, we _could_ modify our code to read the first 18 digits, enter it to the FPU, then read more, multiply what we already have on the TOS by 10 raised to the number of additional digits, then `add` to it.

Yes, we could do that.
But in _this_ program it would be ridiculous (in a different one it may be just the thing to do): Even the circumference of the Earth expressed in millimeters only takes 11 digits.
Clearly, we cannot build a camera that large (not yet, anyway).

So, if the user enters such a huge number, he is either bored, or testing us, or trying to break into the system, or playing games-doing anything but designing a pinhole camera.

What will we do?

We will slap him in the face, in a manner of speaking:

[source,shell]
....
17459765723452353453534535353530530534563507309676764423 ??? ??? ??? ??? ???
....

To achieve that, we will simply ignore any leading zeros.
Once we find a non-zero digit, we will initialize a counter to `0` and start taking three steps:

[.procedure]
. Send the digit to the output.
. Append the digit to a buffer we will use later to produce the packed decimal we can send to the FPU.
. Increase the counter.

Now, while we are taking these three steps, we also need to watch out for one of two conditions:

* If the counter grows above 18, we stop appending to the buffer.
We continue reading the digits and sending them to the output.
* If, or rather _when_, the next input character is not a digit, we are done inputting for now.
+
Incidentally, we can simply discard the non-digit, unless it is a `+#+`, which we must return to the input stream.
It starts a comment, so we must see it after we are done producing output and start looking for more input.

That still leaves one possibility uncovered: If all the user enters is a zero (or several zeros), we will never find a non-zero to display.
We can determine this has happened whenever our counter stays at `0`.
In that case we need to send `0` to the output, and perform another "slap in the face":

[source,shell]
....
0 ??? ??? ??? ??? ???
....

Once we have displayed the focal length and determined it is valid (greater than `0` but not exceeding 18 digits), we can calculate the pinhole diameter.

It is not by coincidence that _pinhole_ contains the word _pin_.
Indeed, many a pinhole literally is a _pin hole_, a hole carefully punched with the tip of a pin.
That is because a typical pinhole is very small.

Our formula gets the result in millimeters.
We will multiply it by `1000`, so we can output the result in _microns_.

At this point we have yet another trap to face: _Too much precision._

Yes, the FPU was designed for high precision mathematics.
But we are not dealing with high precision mathematics.
We are dealing with physics (optics, specifically).

Suppose we want to convert a truck into a pinhole camera (we would not be the first ones to do that!).
Suppose its box is `12` meters long, so we have the focal length of `12000`.
Well, using Bender's constant, it gives us square root of `12000` multiplied by `0.04`, which is `4.381780460` millimeters, or `4381.780460` microns.

Put either way, the result is absurdly precise.
Our truck is not _exactly_ `12000` millimeters long.
We did not measure its length with such a precision, so stating we need a pinhole with the diameter of `4.381780460` millimeters is, well, deceiving.
`4.4` millimeters would do just fine.

[NOTE]
====
I "only" used ten digits in the above example.
Imagine the absurdity of going for all 18!
====

We need to limit the number of significant digits of our result.
One way of doing it is by using an integer representing microns.
So, our truck would need a pinhole with the diameter of `4382` microns.
Looking at that number, we still decide that `4400` microns, or `4.4` millimeters is close enough.
Additionally, we can decide that no matter how big a result we get, we only want to display four significant digits (or any other number of them, of course).
Alas, the FPU does not offer rounding to a specific number of digits (after all, it does not view the numbers as decimal but as binary).

We, therefore, must devise an algorithm to reduce the number of significant digits.

Here is mine (I think it is awkward-if you know a better one, _please_, let me know):

[.procedure]
. Initialize a counter to `0`.
. While the number is greater than or equal to `10000`, divide it by `10` and increase the counter.
. Output the result.
. While the counter is greater than `0`, output `0` and decrease the counter.

[NOTE]
====
The `10000` is only good if you want _four_ significant digits.
For any other number of significant digits, replace `10000` with `10` raised to the number of significant digits.
====

We will, then, output the pinhole diameter in microns, rounded off to four significant digits.

At this point, we know the _focal length_ and the _pinhole diameter_.
That means we have enough information to also calculate the _f-number_.

We will display the f-number, rounded to four significant digits.
Chances are the f-number will tell us very little.
To make it more meaningful, we can find the nearest _normalized f-number_, i.e., the nearest power of the square root of 2.

We do that by multiplying the actual f-number by itself, which, of course, will give us its `square`.
We will then calculate its base-2 logarithm, which is much easier to do than calculating the base-square-root-of-2 logarithm!
We will round the result to the nearest integer.
Next, we will raise 2 to the result.
Actually, the FPU gives us a good shortcut to do that: We can use the `fscale` op code to "scale" 1, which is analogous to ``shift``ing an integer left.
Finally, we calculate the square root of it all, and we have the nearest normalized f-number.
If all that sounds overwhelming, or perhaps like too much work, it may become much clearer when you see the code. It takes 9 op codes altogether:

[.programlisting]
....
fmul st0, st0
fld1
fld st1
fyl2x
frndint
fld1
fscale
fsqrt
fstp st1
....

The first line, `fmul st0, st0`, squares the contents of the TOS (top of the stack, same as `st`, called `st0` by nasm). The `fld1` pushes `1` on the TOS.

The next line, `fld st1`, pushes the square back to the TOS. At this point the square is both in `st` and `st(2)` (it will become clear why we leave a second copy on the stack in a moment). `st(1)` contains `1`.

Next, `fyl2x` calculates the base-2 logarithm of `st` multiplied by `st(1)`. That is why we placed `1` on `st(1)` before.

At this point, `st` contains the logarithm we have just calculated, `st(1)` contains the square of the actual f-number we saved for later.

`frndint` rounds the TOS to the nearest integer. `fld1` pushes a `1`. `fscale` shifts the `1` we have on the TOS by the value in `st(1)`, effectively raising 2 to `st(1)`.

Finally, `fsqrt` calculates the square root of the result, i.e., the nearest normalized f-number.

We now have the nearest normalized f-number on the TOS, the base-2 logarithm rounded to the nearest integer in `st(1)`, and the square of the actual f-number in `st(2)`. We are saving the value in `st(2)` for later. But we do not need the contents of `st(1)` anymore.

The last line, `fstp st1`, places the contents of `st` into `st(1)`, and pops. As a result, what was `st(1)` is now `st`, what was `st(2)` is now `st(1)`, etc. The new `st` contains the normalized f-number. The new `st(1)` contains the square of the actual f-number we have stored there for posterity.

At this point, we are ready to output the normalized f-number. Because it is normalized, we will not round it off to four significant digits, but will send it out in its full precision.

The normalized f-number is useful as long as it is reasonably small and can be found on our light meter.
Otherwise we need a different method of determining proper exposure. Earlier we have figured out the formula of calculating proper exposure at an arbitrary f-number from that measured at a different f-number. Every light meter I have ever seen can determine proper exposure at f5.6. We will, therefore, calculate an _"f5.6 multiplier,"_ i.e., by how much we need to multiply the exposure measured at f5.6 to determine the proper exposure for our pinhole camera. From the above formula we know this factor can be calculated by dividing our f-number (the actual one, not the normalized one) by `5.6`, and squaring the result. Mathematically, dividing the square of our f-number by the square of `5.6` will give us the same result. Computationally, we do not want to square two numbers when we can only square one. So, the first solution seems better at first. But... `5.6` is a _constant_. We do not have to have our FPU waste precious cycles. We can just tell it to divide the square of the f-number by whatever `5.6²` equals to. Or we can divide the f-number by `5.6`, and then square the result. The two ways now seem equal. But, they are not! Having studied the principles of photography above, we remember that the `5.6` is actually square root of 2 raised to the fifth power. An _irrational_ number. The square of this number is _exactly_ `32`. Not only is `32` an integer, it is a power of 2. We do not need to divide the square of the f-number by `32`. We only need to use `fscale` to shift it right by five positions. In the FPU lingo it means we will `fscale` it with `st(1)` equal to `-5`. That is _much faster_ than a division. So, now it has become clear why we have saved the square of the f-number on the top of the FPU stack. The calculation of the f5.6 multiplier is the easiest calculation of this entire program! We will output it rounded to four significant digits. There is one more useful number we can calculate: The number of stops our f-number is from f5.6. 
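The whole discussion boils down to very little arithmetic. A high-level sketch (Python, for illustration only; the program itself does the division with a single `fscale`):

[source,python]
....
import math

def f56_multiplier(f_number):
    # Treating the nominal 5.6 as sqrt(2)**5 (about 5.657), its square
    # is exactly 32, so dividing the square of the f-number by 32 is
    # the same as dividing the f-number by "5.6" and squaring.
    return f_number * f_number / 32

# f11 (more precisely sqrt(2)**7, about 11.3137) is two stops from
# f5.6, so it needs 4 times the exposure metered at f5.6:
print(round(f56_multiplier(11.3137)))             # 4
print(round(math.log2(f56_multiplier(11.3137))))  # 2 stops
....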
This may help us if our f-number is just outside the range of our light meter, but we have a shutter which lets us set various speeds, and this shutter uses stops. Say, our f-number is 5 stops from f5.6, and the light meter says we should use 1/1000 sec. Then we can set our shutter speed to 1/1000 first, then move the dial by 5 stops.

This calculation is quite easy as well. All we have to do is to calculate the base-2 logarithm of the f5.6 multiplier we had just calculated (though we need its value from before we rounded it off). We then output the result rounded to the nearest integer. We do not need to worry about having more than four significant digits in this one: The result is most likely to have only one or two digits anyway.

[[x86-fpu-optimizations]]
=== FPU Optimizations

In assembly language we can optimize the FPU code in ways impossible in high-level languages, including C.

Whenever a C function needs to calculate a floating-point value, it loads all necessary variables and constants into FPU registers. It then does whatever calculation is required to get the correct result. Good C compilers can optimize that part of the code really well.

It "returns" the value by leaving the result on the TOS. However, before it returns, it cleans up. Any variables and constants it used in its calculation are now gone from the FPU.

It cannot do what we just did above: We calculated the square of the f-number and kept it on the stack for later use by another function. We _knew_ we would need that value later on. We also knew we had enough room on the stack (which only has room for 8 numbers) to store it there.

A C compiler has no way of knowing that a value it has on the stack will be required again in the very near future. Of course, the C programmer may know it. But the only recourse he has is to store the value in a memory variable.
That means, for one, the value will be changed from the 80-bit precision used internally by the FPU to a C _double_ (64 bits) or even _single_ (32 bits). That also means that the value must be moved from the TOS into the memory, and then back again. Alas, of all FPU operations, the ones that access the computer memory are the slowest. So, whenever programming the FPU in assembly language, look for the ways of keeping intermediate results on the FPU stack. We can take that idea even further! In our program we are using a _constant_ (the one we named `PC`). It does not matter how many pinhole diameters we are calculating: 1, 10, 20, 1000, we are always using the same constant. Therefore, we can optimize our program by keeping the constant on the stack all the time. Early on in our program, we are calculating the value of the above constant. We need to divide our input by `10` for every digit in the constant. It is much faster to multiply than to divide. So, at the start of our program, we divide `10` into `1` to obtain `0.1`, which we then keep on the stack: Instead of dividing the input by `10` for every digit, we multiply it by `0.1`. By the way, we do not input `0.1` directly, even though we could. We have a reason for that: While `0.1` can be expressed with just one decimal place, we do not know how many _binary_ places it takes. We, therefore, let the FPU calculate its binary value to its own high precision. We are using other constants: We multiply the pinhole diameter by `1000` to convert it from millimeters to microns. We compare numbers to `10000` when we are rounding them off to four significant digits. So, we keep both, `1000` and `10000`, on the stack. And, of course, we reuse the `0.1` when rounding off numbers to four digits. Last but not least, we keep `-5` on the stack. We need it to scale the square of the f-number, instead of dividing it by `32`. It is not by coincidence we load this constant last. 
That makes it the top of the stack when only the constants are on it. So, when the square of the f-number is being scaled, the `-5` is at `st(1)`, precisely where `fscale` expects it to be.

It is common to create certain constants from scratch instead of loading them from the memory. That is what we are doing with `-5`:

[.programlisting]
....
fld1            ; TOS = 1
fadd st0, st0   ; TOS = 2
fadd st0, st0   ; TOS = 4
fld1            ; TOS = 1
faddp st1, st0  ; TOS = 5
fchs            ; TOS = -5
....

We can generalize all these optimizations into one rule: _Keep repeat values on the stack!_

[TIP]
====
_PostScript(R)_ is a stack-oriented programming language. There are many more books available about PostScript(R) than about the FPU assembly language: Mastering PostScript(R) will help you master the FPU.
====

[[x86-pinhole-the-code]]
=== pinhole-The Code

[.programlisting]
....
;;;;;;; pinhole.asm ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; Find various parameters of a pinhole camera construction and use
;
; Started: 9-Jun-2001
; Updated: 10-Jun-2001
;
; Copyright (c) 2001 G. Adam Stanislav
; All rights reserved.
;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

%include 'system.inc'

%define BUFSIZE 2048

section .data
align 4
ten dd 10
thousand dd 1000
tthou dd 10000
fd.in dd stdin
fd.out dd stdout
envar db 'PINHOLE=' ; Exactly 8 bytes, or 2 dwords long
pinhole db '04,', 0Ah ; Bender's constant (0.04)
connors db '037', 0Ah ; Connors' constant
usg db 'Usage: pinhole [-b] [-c] [-e] [-p <value>] [-o <outfile>] [-i <infile>]', 0Ah
usglen equ $-usg
iemsg db "pinhole: Can't open input file", 0Ah
iemlen equ $-iemsg
oemsg db "pinhole: Can't create output file", 0Ah
oemlen equ $-oemsg
pinmsg db "pinhole: The PINHOLE constant must not be 0", 0Ah
pinlen equ $-pinmsg
toobig db "pinhole: The PINHOLE constant may not exceed 18 decimal places", 0Ah
biglen equ $-toobig
huhmsg db 9, '???'
separ db 9, '???'
sep2 db 9, '???'
sep3 db 9, '???'
sep4 db 9, '???', 0Ah huhlen equ $-huhmsg header db 'focal length in millimeters,pinhole diameter in microns,' db 'F-number,normalized F-number,F-5.6 multiplier,stops ' db 'from F-5.6', 0Ah headlen equ $-header section .bss ibuffer resb BUFSIZE obuffer resb BUFSIZE dbuffer resb 20 ; decimal input buffer bbuffer resb 10 ; BCD buffer section .text align 4 huh: call write push dword huhlen push dword huhmsg push dword [fd.out] sys.write add esp, byte 12 ret align 4 perr: push dword pinlen push dword pinmsg push dword stderr sys.write push dword 4 ; return failure sys.exit align 4 consttoobig: push dword biglen push dword toobig push dword stderr sys.write push dword 5 ; return failure sys.exit align 4 ierr: push dword iemlen push dword iemsg push dword stderr sys.write push dword 1 ; return failure sys.exit align 4 oerr: push dword oemlen push dword oemsg push dword stderr sys.write push dword 2 sys.exit align 4 usage: push dword usglen push dword usg push dword stderr sys.write push dword 3 sys.exit align 4 global _start _start: add esp, byte 8 ; discard argc and argv[0] sub esi, esi .arg: pop ecx or ecx, ecx je near .getenv ; no more arguments ; ECX contains the pointer to an argument cmp byte [ecx], '-' jne usage inc ecx mov ax, [ecx] inc ecx .o: cmp al, 'o' jne .i ; Make sure we are not asked for the output file twice cmp dword [fd.out], stdout jne usage ; Find the path to output file - it is either at [ECX+1], ; i.e., -ofile -- ; or in the next argument, ; i.e., -o file or ah, ah jne .openoutput pop ecx jecxz usage .openoutput: push dword 420 ; file mode (644 octal) push dword 0200h | 0400h | 01h ; O_CREAT | O_TRUNC | O_WRONLY push ecx sys.open jc near oerr add esp, byte 12 mov [fd.out], eax jmp short .arg .i: cmp al, 'i' jne .p ; Make sure we are not asked twice cmp dword [fd.in], stdin jne near usage ; Find the path to the input file or ah, ah jne .openinput pop ecx or ecx, ecx je near usage .openinput: push dword 0 ; O_RDONLY push ecx sys.open jc near ierr ; 
open failed add esp, byte 8 mov [fd.in], eax jmp .arg .p: cmp al, 'p' jne .c or ah, ah jne .pcheck pop ecx or ecx, ecx je near usage mov ah, [ecx] .pcheck: cmp ah, '0' jl near usage cmp ah, '9' ja near usage mov esi, ecx jmp .arg .c: cmp al, 'c' jne .b or ah, ah jne near usage mov esi, connors jmp .arg .b: cmp al, 'b' jne .e or ah, ah jne near usage mov esi, pinhole jmp .arg .e: cmp al, 'e' jne near usage or ah, ah jne near usage mov al, ',' mov [huhmsg], al mov [separ], al mov [sep2], al mov [sep3], al mov [sep4], al jmp .arg align 4 .getenv: ; If ESI = 0, we did not have a -p argument, ; and need to check the environment for "PINHOLE=" or esi, esi jne .init sub ecx, ecx .nextenv: pop esi or esi, esi je .default ; no PINHOLE envar found ; check if this envar starts with 'PINHOLE=' mov edi, envar mov cl, 2 ; 'PINHOLE=' is 2 dwords long rep cmpsd jne .nextenv ; Check if it is followed by a digit mov al, [esi] cmp al, '0' jl .default cmp al, '9' jbe .init ; fall through align 4 .default: ; We got here because we had no -p argument, ; and did not find the PINHOLE envar. 
mov esi, pinhole ; fall through align 4 .init: sub eax, eax sub ebx, ebx sub ecx, ecx sub edx, edx mov edi, dbuffer+1 mov byte [dbuffer], '0' ; Convert the pinhole constant to real .constloop: lodsb cmp al, '9' ja .setconst cmp al, '0' je .processconst jb .setconst inc dl .processconst: inc cl cmp cl, 18 ja near consttoobig stosb jmp short .constloop align 4 .setconst: or dl, dl je near perr finit fild dword [tthou] fld1 fild dword [ten] fdivp st1, st0 fild dword [thousand] mov edi, obuffer mov ebp, ecx call bcdload .constdiv: fmul st0, st2 loop .constdiv fld1 fadd st0, st0 fadd st0, st0 fld1 faddp st1, st0 fchs ; If we are creating a CSV file, ; print header cmp byte [separ], ',' jne .bigloop push dword headlen push dword header push dword [fd.out] sys.write .bigloop: call getchar jc near done ; Skip to the end of the line if you got '#' cmp al, '#' jne .num call skiptoeol jmp short .bigloop .num: ; See if you got a number cmp al, '0' jl .bigloop cmp al, '9' ja .bigloop ; Yes, we have a number sub ebp, ebp sub edx, edx .number: cmp al, '0' je .number0 mov dl, 1 .number0: or dl, dl ; Skip leading 0's je .nextnumber push eax call putchar pop eax inc ebp cmp ebp, 19 jae .nextnumber mov [dbuffer+ebp], al .nextnumber: call getchar jc .work cmp al, '#' je .ungetc cmp al, '0' jl .work cmp al, '9' ja .work jmp short .number .ungetc: dec esi inc ebx .work: ; Now, do all the work or dl, dl je near .work0 cmp ebp, 19 jae near .toobig call bcdload ; Calculate pinhole diameter fld st0 ; save it fsqrt fmul st0, st3 fld st0 fmul st5 sub ebp, ebp ; Round off to 4 significant digits .diameter: fcom st0, st7 fstsw ax sahf jb .printdiameter fmul st0, st6 inc ebp jmp short .diameter .printdiameter: call printnumber ; pinhole diameter ; Calculate F-number fdivp st1, st0 fld st0 sub ebp, ebp .fnumber: fcom st0, st6 fstsw ax sahf jb .printfnumber fmul st0, st5 inc ebp jmp short .fnumber .printfnumber: call printnumber ; F number ; Calculate normalized F-number fmul st0, st0 fld1 fld st1 
fyl2x frndint fld1 fscale fsqrt fstp st1 sub ebp, ebp call printnumber ; Calculate time multiplier from F-5.6 fscale fld st0 ; Round off to 4 significant digits .fmul: fcom st0, st6 fstsw ax sahf jb .printfmul inc ebp fmul st0, st5 jmp short .fmul .printfmul: call printnumber ; F multiplier ; Calculate F-stops from 5.6 fld1 fxch st1 fyl2x sub ebp, ebp call printnumber mov al, 0Ah call putchar jmp .bigloop .work0: mov al, '0' call putchar align 4 .toobig: call huh jmp .bigloop align 4 done: call write ; flush output buffer ; close files push dword [fd.in] sys.close push dword [fd.out] sys.close finit ; return success push dword 0 sys.exit align 4 skiptoeol: ; Keep reading until you come to cr, lf, or eof call getchar jc done cmp al, 0Ah jne .cr ret .cr: cmp al, 0Dh jne skiptoeol ret align 4 getchar: or ebx, ebx jne .fetch call read .fetch: lodsb dec ebx clc ret read: jecxz .read call write .read: push dword BUFSIZE mov esi, ibuffer push esi push dword [fd.in] sys.read add esp, byte 12 mov ebx, eax or eax, eax je .empty sub eax, eax ret align 4 .empty: add esp, byte 4 stc ret align 4 putchar: stosb inc ecx cmp ecx, BUFSIZE je write ret align 4 write: jecxz .ret ; nothing to write sub edi, ecx ; start of buffer push ecx push edi push dword [fd.out] sys.write add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now .ret: ret align 4 bcdload: ; EBP contains the number of chars in dbuffer push ecx push esi push edi lea ecx, [ebp+1] lea esi, [dbuffer+ebp-1] shr ecx, 1 std mov edi, bbuffer sub eax, eax mov [edi], eax mov [edi+4], eax mov [edi+2], ax .loop: lodsw sub ax, 3030h shl al, 4 or al, ah mov [edi], al inc edi loop .loop fbld [bbuffer] cld pop edi pop esi pop ecx sub eax, eax ret align 4 printnumber: push ebp mov al, [separ] call putchar ; Print the integer at the TOS mov ebp, bbuffer+9 fbstp [bbuffer] ; Check the sign mov al, [ebp] dec ebp or al, al jns .leading ; We got a negative number (should never happen) mov al, '-' call putchar .leading: ; Skip 
leading zeros mov al, [ebp] dec ebp or al, al jne .first cmp ebp, bbuffer jae .leading ; We are here because the result was 0. ; Print '0' and return mov al, '0' jmp putchar .first: ; We have found the first non-zero. ; But it is still packed test al, 0F0h jz .second push eax shr al, 4 add al, '0' call putchar pop eax and al, 0Fh .second: add al, '0' call putchar .next: cmp ebp, bbuffer jb .done mov al, [ebp] push eax shr al, 4 add al, '0' call putchar pop eax and al, 0Fh add al, '0' call putchar dec ebp jmp short .next .done: pop ebp or ebp, ebp je .ret .zeros: mov al, '0' call putchar dec ebp jne .zeros .ret: ret .... The code follows the same format as all the other filters we have seen before, with one subtle exception: ____ We are no longer assuming that the end of input implies the end of things to do, something we took for granted in the _character-oriented_ filters. This filter does not process characters. It processes a _language_ (albeit a very simple one, consisting only of numbers). When we have no more input, it can mean one of two things: * We are done and can quit. This is the same as before. * The last character we have read was a digit. We have stored it at the end of our ASCII-to-float conversion buffer. We now need to convert the contents of that buffer into a number and write the last line of our output. For that reason, we have modified our `getchar` and our `read` routines to return with the `carry flag` _clear_ whenever we are fetching another character from the input, or the `carry flag` _set_ whenever there is no more input. Of course, we are still using assembly language magic to do that! Take a good look at `getchar`. It _always_ returns with the `carry flag` _clear_. Yet, our main code relies on the `carry flag` to tell it when to quit-and it works. The magic is in `read`. Whenever it receives more input from the system, it just returns to `getchar`, which fetches a character from the input buffer, _clears_ the `carry flag` and returns. 
But when `read` receives no more input from the system, it does _not_ return to `getchar` at all. Instead, the `add esp, byte 4` op code adds `4` to `ESP`, _sets_ the `carry flag`, and returns. So, where does it return to?

Whenever a program uses the `call` op code, the microprocessor ``push``es the return address, i.e., it stores it on the top of the stack (not the FPU stack, the system stack, which is in the memory). When a program uses the `ret` op code, the microprocessor ``pop``s the return value from the stack, and jumps to the address that was stored there.

But since we added `4` to `ESP` (which is the stack pointer register), we have effectively given the microprocessor a minor case of _amnesia_: It no longer remembers it was `getchar` that ``call``ed `read`.

And since `getchar` never ``push``ed anything before ``call``ing `read`, the top of the stack now contains the return address to whatever or whoever ``call``ed `getchar`. As far as that caller is concerned, he ``call``ed `getchar`, which ``ret``urned with the `carry flag` set!
____

Other than that, the `bcdload` routine is caught up in the middle of a Lilliputian conflict between the Big-Endians and the Little-Endians. It is converting the text representation of a number into that number: The text is stored in the big-endian order, but the _packed decimal_ is little-endian.

To solve the conflict, we use the `std` op code early on. We cancel it with `cld` later on: It is quite important we do not `call` anything that may depend on the default setting of the _direction flag_ while `std` is active.

Everything else in this code should be quite clear, providing you have read the entire chapter that precedes it.

It is a classical example of the adage that programming requires a lot of thought and only a little coding. Once we have thought through every tiny detail, the code almost writes itself.
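To summarize the calculations before we move on, here is everything pinhole computes for one focal length, condensed into a high-level sketch (Python, for illustration; the names are mine, and unlike the real program this sketch skips the four-significant-digit rounding, which only changes values of `10000` and up):

[source,python]
....
import math

def pinhole_row(focal_mm, constant=0.04):
    diameter_mm = constant * math.sqrt(focal_mm)  # pinhole diameter
    microns = diameter_mm * 1000                  # ...in microns
    f_number = focal_mm / diameter_mm             # the f-number
    normalized = math.sqrt(2 ** round(math.log2(f_number * f_number)))
    multiplier = f_number * f_number / 32         # exposure vs. f5.6
    stops = round(math.log2(multiplier))          # stops from f5.6
    return (focal_mm, round(microns), round(f_number),
            round(normalized), round(multiplier), stops)

print(pinhole_row(150))   # (150, 490, 306, 362, 2930, 12)
....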
[[x86-pinhole-using]] === Using pinhole Because we have decided to make the program _ignore_ any input except for numbers (and even those inside a comment), we can actually perform _textual queries_. We do not _have to_, but we _can_. In my humble opinion, forming a textual query, instead of having to follow a very strict syntax, makes software much more user friendly. Suppose we want to build a pinhole camera to use the 4x5 inch film. The standard focal length for that film is about 150mm. We want to _fine-tune_ our focal length so the pinhole diameter is as round a number as possible. Let us also suppose we are quite comfortable with cameras but somewhat intimidated by computers. Rather than just have to type in a bunch of numbers, we want to _ask_ a couple of questions. Our session might look like this: [source,shell] .... % pinhole Computer, What size pinhole do I need for the focal length of 150? 150 490 306 362 2930 12 Hmmm... How about 160? 160 506 316 362 3125 12 Let's make it 155, please. 155 498 311 362 3027 12 Ah, let's try 157... 157 501 313 362 3066 12 156? 156 500 312 362 3047 12 That's it! Perfect! Thank you very much! ^D .... We have found that while for the focal length of 150, our pinhole diameter should be 490 microns, or 0.49 mm, if we go with the almost identical focal length of 156 mm, we can get away with a pinhole diameter of exactly one half of a millimeter. [[x86-pinhole-scripting]] === Scripting Because we have chosen the `+#+` character to denote the start of a comment, we can treat our pinhole software as a _scripting language_. You have probably seen shell _scripts_ that start with: [.programlisting] .... #! /bin/sh .... ...or... [.programlisting] .... #!/bin/sh .... ...because the blank space after the `#!` is optional. Whenever UNIX(R) is asked to run an executable file which starts with the `#!`, it assumes the file is a script. It adds the command to the rest of the first line of the script, and tries to execute that. 
Now that we have installed pinhole in [.filename]#/usr/local/bin/#, we can write a script to calculate various pinhole diameters suitable for various focal lengths commonly used with the 120 film.

The script might look something like this:

[.programlisting]
....
#! /usr/local/bin/pinhole -b -i
# Find the best pinhole diameter
# for the 120 film

### Standard
80

### Wide angle
30, 40, 50, 60, 70

### Telephoto
100, 120, 140
....

Because 120 is a medium-size film, we may name this file [.filename]#medium#. We can set its permissions to execute, and run it as if it were a program:

[source,shell]
....
% chmod 755 medium
% ./medium
....

UNIX(R) will interpret that last command as:

[source,shell]
....
% /usr/local/bin/pinhole -b -i ./medium
....

It will run that command and display:

[source,shell]
....
80 358 224 256 1562 11
30 219 137 128 586 9
40 253 158 181 781 10
50 283 177 181 977 10
60 310 194 181 1172 10
70 335 209 181 1367 10
100 400 250 256 1953 11
120 438 274 256 2344 11
140 473 296 256 2734 11
....

Now, let us enter:

[source,shell]
....
% ./medium -c
....

UNIX(R) will treat that as:

[source,shell]
....
% /usr/local/bin/pinhole -b -i ./medium -c
....

That gives it two conflicting options: `-b` and `-c` (use Bender's constant and use Connors' constant). We have programmed it so later options override earlier ones, so our program will calculate everything using Connors' constant:

[source,shell]
....
80 331 242 256 1826 11
30 203 148 128 685 9
40 234 171 181 913 10
50 262 191 181 1141 10
60 287 209 181 1370 10
70 310 226 256 1598 11
100 370 270 256 2283 11
120 405 296 256 2739 11
140 438 320 362 3196 12
....

We decide we want to go with Bender's constant after all. We want to save its values as a comma-separated file:

[source,shell]
....
% ./medium -b -e > bender % cat bender focal length in millimeters,pinhole diameter in microns,F-number,normalized F-number,F-5.6 multiplier,stops from F-5.6 80,358,224,256,1562,11 30,219,137,128,586,9 40,253,158,181,781,10 50,283,177,181,977,10 60,310,194,181,1172,10 70,335,209,181,1367,10 100,400,250,256,1953,11 120,438,274,256,2344,11 140,473,296,256,2734,11 % .... [[x86-caveats]] == Caveats Assembly language programmers who "grew up" under MS-DOS(R) and Windows(R) often tend to take shortcuts. Reading the keyboard scan codes and writing directly to video memory are two classical examples of practices which, under MS-DOS(R) are not frowned upon but considered the right thing to do. The reason? Both the PC BIOS and MS-DOS(R) are notoriously slow when performing these operations. You may be tempted to continue similar practices in the UNIX(R) environment. For example, I have seen a web site which explains how to access the keyboard scan codes on a popular UNIX(R) clone. That is generally a _very bad idea_ in UNIX(R) environment! Let me explain why. [[x86-protected]] === UNIX(R) Is Protected For one thing, it may simply not be possible. UNIX(R) runs in protected mode. Only the kernel and device drivers are allowed to access hardware directly. Perhaps a particular UNIX(R) clone will let you read the keyboard scan codes, but chances are a real UNIX(R) operating system will not. And even if one version may let you do it, the next one may not, so your carefully crafted software may become a dinosaur overnight. [[x86-abstraction]] === UNIX(R) Is an Abstraction But there is a much more important reason not to try accessing the hardware directly (unless, of course, you are writing a device driver), even on the UNIX(R) like systems that let you do it: _UNIX(R) is an abstraction!_ There is a major difference in the philosophy of design between MS-DOS(R) and UNIX(R). MS-DOS(R) was designed as a single-user system. 
It is run on a computer with a keyboard and a video screen attached directly to that computer. User input is almost guaranteed to come from that keyboard. Your program's output virtually always ends up on that screen. This is NEVER guaranteed under UNIX(R). It is quite common for a UNIX(R) user to pipe and redirect program input and output: [source,shell] .... % program1 | program2 | program3 > file1 .... If you have written program2, your input does not come from the keyboard but from the output of program1. Similarly, your output does not go to the screen but becomes the input for program3 whose output, in turn, goes to [.filename]#file1#. But there is more! Even if you made sure that your input comes from, and your output goes to, the terminal, there is no guarantee the terminal is a PC: It may not have its video memory where you expect it, nor may its keyboard be producing PC-style scan codes. It may be a Macintosh(R), or any other computer. Now you may be shaking your head: My software is in PC assembly language, how can it run on a Macintosh(R)? But I did not say your software would be running on a Macintosh(R), only that its terminal may be a Macintosh(R). Under UNIX(R), the terminal does not have to be directly attached to the computer that runs your software, it can even be on another continent, or, for that matter, on another planet. It is perfectly possible that a Macintosh(R) user in Australia connects to a UNIX(R) system in North America (or anywhere else) via telnet. The software then runs on one computer, while the terminal is on a different computer: If you try to read the scan codes, you will get the wrong input! Same holds true about any other hardware: A file you are reading may be on a disk you have no direct access to. A camera you are reading images from may be on a space shuttle, connected to you via satellites. That is why under UNIX(R) you must never make any assumptions about where your data is coming from and going to. 
Always let the system handle the physical access to the hardware.

[NOTE]
====
These are caveats, not absolute rules. Exceptions are possible. For example, if a text editor has determined it is running on a local machine, it may want to read the scan codes directly for improved control.

I am not mentioning these caveats to tell you what to do or what not to do, just to make you aware of certain pitfalls that await you if you have just arrived in UNIX(R) from MS-DOS(R). Of course, creative people often break rules, and it is OK as long as they know they are breaking them and why.
====

[[x86-acknowledgements]]
== Acknowledgements

This tutorial would never have been possible without the help of many experienced FreeBSD programmers from the {freebsd-hackers}, many of whom have patiently answered my questions, and pointed me in the right direction in my attempts to explore the inner workings of UNIX(R) system programming in general and FreeBSD in particular.

Thomas M. Sommers opened the door for me. His https://web.archive.org/web/20090914064615/http://www.codebreakers-journal.com/content/view/262/27[How do I write "Hello, world" in FreeBSD assembler?] web page was my first encounter with an example of assembly language programming under FreeBSD.

Jake Burkholder has kept the door open by willingly answering all of my questions and supplying me with example assembly language source code.

Copyright (C) 2000-2001 G. Adam Stanislav. All rights reserved.

diff --git a/documentation/content/en/books/porters-handbook/uses/_index.adoc index 84c6123b80..3f23a7e0f4 100644 --- a/documentation/content/en/books/porters-handbook/uses/_index.adoc +++ b/documentation/content/en/books/porters-handbook/uses/_index.adoc @@ -1,3141 +1,3141 @@ --- title: Chapter 17.
Using USES Macros prev: books/porters-handbook/keeping-up next: books/porters-handbook/versions description: USES macros make it easy to declare requirements and settings for a FreeBSD Port tags: ["uses", "macros", "introduction", "guide"] showBookMenu: true weight: 17 params: path: "/books/porters-handbook/uses/" --- [[uses]] = Using `USES` Macros :doctype: book :toc: macro :toclevels: 1 :icons: font :sectnums: :sectnumlevels: 6 :sectnumoffset: 17 :partnums: :source-highlighter: rouge :experimental: :images-path: books/porters-handbook/ ifdef::env-beastie[] ifdef::backend-html5[] :imagesdir: ../../../../images/{images-path} endif::[] ifndef::book[] include::shared/authors.adoc[] include::shared/mirrors.adoc[] include::shared/releases.adoc[] include::shared/attributes/attributes-{{% lang %}}.adoc[] include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang %}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[] toc::[] endif::[] ifdef::backend-pdf,backend-epub3[] include::../../../../../shared/asciidoctor.adoc[] endif::[] endif::[] ifndef::env-beastie[] toc::[] include::../../../../../shared/asciidoctor.adoc[] endif::[] [[uses-intro]] == An Introduction to `USES` `USES` macros make it easy to declare requirements and settings for a port. They can add dependencies, change building behavior, add metadata to packages, and so on, all by selecting simple, preset values. Each section in this chapter describes a possible value for `USES`, along with its possible arguments. Arguments are appended to the value after a colon (`:`). Multiple arguments are separated by commas (`,`). [[uses-intro-ex1]] .Using Multiple Values [example] ==== [.programlisting] .... USES= bison perl .... ==== [[uses-intro-ex2]] .Adding an Argument [example] ==== [.programlisting] .... USES= tar:xz .... ==== [[uses-intro-ex3]] .Adding Multiple Arguments [example] ==== [.programlisting] .... USES= drupal:7,theme .... 
====

[[uses-intro-ex4]]
.Mixing it All Together
[example]
====
[.programlisting]
....
USES=	pgsql:9.3+ cpe python:2.7,build
....
====

[[uses-7z]]
== `7z`

Possible arguments: (none), `p7zip`, `partial`

Extracts using man:7z[1] instead of man:bsdtar[1] and sets `EXTRACT_SUFX=.7z`.
The `p7zip` option forces a dependency on the `7z` from package:archivers/p7zip[] if the one from the base system is not able to extract the files.
`EXTRACT_SUFX` is not changed if the `partial` option is used; this can be used if the main distribution file does not have a [.filename]#.7z# extension.

[[uses-ada]]
== `ada`

Possible arguments: (none), `6`, `12`, `(run)`

Depends on an Ada-capable compiler, and sets `CC` accordingly.
Defaults to using `gcc6-aux` from ports.

[[uses-angr]]
== `angr`

Possible arguments: `binaries`, `nose`

Provide support for ports that need the https://github.com/angr/angr[angr binary analysis platform].

If the `binaries` argument is present, the port requires the special `angr` binaries for testing.

If the `nose` argument is present, the port uses `nosetests` for the test target.
This argument implies `USES=python:test`.

The framework provides the following variables to be set by the port:

`ANGR_VERSION`::
The version of the `angr` project programs.

`ANGR_BINARIES_TAGNAME`::
The tagname of the `angr` binaries.

`ANGR_NOSETESTS`::
The path to the `nosetests` program.

[[uses-ansible]]
== `ansible`

Possible arguments: `env`, `module`, `plugin`

Provide support for ports depending on package:sysutils/ansible[].

If the `env` argument is present, the port does not depend on package:sysutils/ansible[] but needs some Ansible variables set.

If the `module` argument is present then the port is an Ansible module.

If the `plugin` argument is present then the port is an Ansible plugin.

The framework exposes the following variables to the port:

`ANSIBLE_CMD`::
Path to the ansible program.

`ANSIBLE_DOC_CMD`::
Path to the ansible-doc program.
`ANSIBLE_RUN_DEPENDS`::
RUN_DEPENDS with the Ansible port.

`ANSIBLE_DATADIR`::
Path to the root of the directory structure where all Ansible's modules and plugins are stored.

`ANSIBLE_ETCDIR`::
Path to the Ansible etc directory.

`ANSIBLE_PLUGINS_PREFIX`::
Path to the "plugins" directory within `${ANSIBLE_DATADIR}`.

`ANSIBLE_MODULESDIR`::
Path to the directory for local Ansible modules.

`ANSIBLE_PLUGINSDIR`::
Path to the directory for local Ansible plugins.

`ANSIBLE_PLUGIN_TYPE`::
Ansible plugin type (e.g., "connection", "inventory", or "vars").

[[uses-apache]]
== `apache`

Possible arguments: (none), `2.4`, `build`, `run`, `server`

Provide support for ports depending on the Apache web server.
The version argument can be used to require a specific Apache httpd version.
It is possible to set a specific version (`USES=apache:2.4`), a minimum version (`USES=apache:2.4+`), or a maximum version (`USES=apache:-2.4`).

If the `build` argument is provided, a build dependency is added to the port.
If the `run` argument is provided, a run dependency is added to the port.
If the `server` argument is provided, then it indicates the port is a server port.

The framework provides the following variables to be set by the port:

`AP_FAST_BUILD`::
Automatic module build.

`AP_GENPLIST`::
Automatic `PLIST` generation, plus add the module disabled into [.filename]#httpd.conf# (only if no `pkg-plist` exists).

`MODULENAME`::
Name of the Apache module.
Default: `${PORTNAME}`

`SHORTMODNAME`::
Short name of the Apache module.
Default: `${MODULENAME:S/mod_//}`

`SRC_FILE`::
Source file of the Apache module.
Default: `${MODULENAME}.c`

The following variables can be accessed by the port:

`APACHE_VERSION`::
The major-minor release version of the chosen Apache server, e.g. 2.4.

`APACHEETCDIR`::
Location of the Apache configuration directory.
Default: [.filename]#${LOCALBASE}/etc/apache24#

`APACHEINCLUDEDIR`::
Location of the Apache include files.
Default: [.filename]#${LOCALBASE}/include/apache24#

`APACHEMODDIR`::
Location of the Apache modules.
Default: [.filename]#${LOCALBASE}/libexec/apache24#

`APACHE_DEFAULT`::
Default Apache version.

[[uses-autoreconf]]
== `autoreconf`

Possible arguments: (none), `build`

Runs `autoreconf`.
It encapsulates the `aclocal`, `autoconf`, `autoheader`, `automake`, `autopoint`, and `libtoolize` commands.
Each command applies to [.filename]#${AUTORECONF_WRKSRC}/configure.ac# or its old name, [.filename]#${AUTORECONF_WRKSRC}/configure.in#.
If [.filename]#configure.ac# defines subdirectories with their own [.filename]#configure.ac# using `AC_CONFIG_SUBDIRS`, `autoreconf` will recursively update those as well.
The `:build` argument only adds build time dependencies on those tools but does not run `autoreconf`.
A port can set `AUTORECONF_WRKSRC` if `WRKSRC` does not contain the path to [.filename]#configure.ac#.

[[uses-azurepy]]
== `azurepy`

Possible arguments: (none)

Provide support for `py-azure*` ports.
Removes `azure` namespaces and cleans up common files.

[[uses-blaslapack]]
== `blaslapack`

Possible arguments: (none), `atlas`, `netlib` (default), `gotoblas`, `openblas`

Adds dependencies on Blas / Lapack libraries.

[[uses-bdb]]
== `bdb`

Possible arguments: (none), `5` (default), `18`

Add dependency on the Berkeley DB library.
Defaults to package:databases/db5[].
It can also depend on package:databases/db18[] when using the `:18` argument.
It is possible to declare a range of acceptable values; `:5+` finds the highest installed version, and falls back to 5 if nothing else is installed.
`INVALID_BDB_VER` can be used to specify versions which do not work with this port.

The framework exposes the following variables to the port:

`BDB_LIB_NAME`::
The name of the Berkeley DB library.
For example, when using package:databases/db5[], it contains `db-5.3`.
`BDB_LIB_CXX_NAME`::
The name of the Berkeley DB C++ library.
For example, when using package:databases/db5[], it contains `db_cxx-5.3`.

`BDB_INCLUDE_DIR`::
The location of the Berkeley DB include directory.
For example, when using package:databases/db5[], it will contain `${LOCALBASE}/include/db5`.

`BDB_LIB_DIR`::
The location of the Berkeley DB library directory.
For example, when using package:databases/db5[], it contains `${LOCALBASE}/lib`.

`BDB_VER`::
The detected Berkeley DB version.
For example, if using `USES=bdb:5+` and Berkeley DB 18 is installed, it contains `18`.

[IMPORTANT]
====
package:databases/db48[] is deprecated and unsupported.
It must not be used by any port.
====

[[uses-bison]]
== `bison`

Possible arguments: (none), `build`, `run`, `both`

Uses package:devel/bison[].
By default, with no arguments or with the `build` argument, it implies `bison` is a build-time dependency, `run` implies a run-time dependency, and `both` implies both run-time and build-time dependencies.

[[uses-budgie]]
== `budgie`

Possible arguments: (none)

Provide support for the Budgie desktop environment.
Use `USE_BUDGIE` to select the components needed for the port.
See crossref:special[using-budgie,Using Budgie] for more information.

[[uses-cabal]]
== `cabal`

[IMPORTANT]
====
Ports should not be created for Haskell libraries, see crossref:special[haskell-libs,Haskell Libraries] for more information.
====

Possible arguments: (none), `hpack`, `nodefault`

Sets default values and targets used to build Haskell software using Cabal.
A build dependency on the Haskell compiler port (package:lang/ghc[]) is added.
If there is some other version of GHC already listed in the `BUILD_DEPENDS` variable (for example, package:lang/ghc810[]), it would be used instead.
If the `hpack` argument is given, a build dependency on package:devel/hs-hpack[] is added and `hpack` is invoked at the configuration step to generate a .cabal file.
If the `nodefault` argument is given, the framework will not try to pull the main distribution file from the Hackage. This argument is implicitly added if `USE_GITHUB` or `USE_GITLAB` is present. The framework provides the following variables: `CABAL_REVISION`:: Haskell packages hosted on Hackage may have revisions. Set this knob to an integer number to pull in revised package description. `USE_CABAL`:: If the software uses Haskell dependencies, list them in this variable. Each item should be present on Hackage and be listed in form `packagename-_0.1.2_`. Dependencies can have revisions too, which are specified after the `_` symbol. Automatic generation of the dependency list is supported, see crossref:special[using-cabal,Building Haskell Applications with `cabal`]. `CABAL_FLAGS`:: List of flags to be passed to `cabal-install` during the configuring and building stage. The flags are passed verbatim. This variable is usually used to enable or disable flags that are declared in the .cabal file. Pass `foo` to enable the `foo` flag and `-foo` to disable it. `CABAL_EXECUTABLES`:: List of executable files installed by the port. Default value: `${PORTNAME}`. Consult the .cabal file of the project being ported to get a list of possible values for this variable. Each value corresponds to an `executable` stanza in the .cabal file. Items from this list are automatically added to pkg-plist. `SKIP_CABAL_PLIST`:: If defined, do not add items from `${CABAL_EXECUTABLES}` to pkg-plist. `opt_USE_CABAL`:: Adds items to `${USE_CABAL}` depending on `opt` option. `opt_CABAL_EXECUTABLES`:: Adds items to `${CABAL_EXECUTABLES}` depending on `opt` option. `opt_CABAL_FLAGS`:: If `opt` is enabled, append the value to `${CABAL_FLAGS}`. Otherwise, append `-value` to disable the flag. Note that this behavior is slightly different from the plain `CABAL_FLAGS` as it does not accept values starting with `-`. 
`CABAL_WRAPPER_SCRIPTS`:: A subset of `${CABAL_EXECUTABLES}` containing Haskell programs to be wrapped into a shell script that sets `*_datadir` environment variables before running the program. This also causes the actual Haskell binary to be installed under `libexec/cabal/` directory. This knob is needed for Haskell programs that install their data files under `share/` directory. `FOO_DATADIR_VARS`:: List of extra Haskell packages, whose data files should be accessible by the executable named `FOO`. The executable should be a part of `${CABAL_WRAPPER_SCRIPTS}`. Haskell packages listed there should not have a version suffix. `CABAL_PROJECT`:: Some Haskell projects may already have a `cabal.project` file, which is also generated by the ports framework. If that is the case, use this variable to specify what to do with the original `cabal.project`. Setting this variable to `remove` will cause the original file to be removed. Setting this variable to `append` will: . Move the original file to `cabal.project.${PORTNAME}` during the `extract` stage. . Concatenate the original `cabal.project.${PORTNAME}` and the generated `cabal.project` into a single file after the `patch` stage. Using `append` makes it possible to perform patching on the original file before it gets merged. [[uses-cargo]] == `cargo` Possible arguments: (none) Uses Cargo for configuring, building, and testing. It can be used to port Rust applications that use the Cargo build system. For more information see crossref:special[using-cargo,Building Rust Applications with `cargo`]. [[uses-charsetfix]] == `charsetfix` Possible arguments: (none) Prevents the port from installing [.filename]#charset.alias#. This must be installed only by package:converters/libiconv[]. `CHARSETFIX_MAKEFILEIN` can be set to a path relative to `WRKSRC` if [.filename]#charset.alias# is not installed by [.filename]#${WRKSRC}/Makefile.in#. [[uses-cl]] == `cl` Possible arguments: (none) Provides support for Common Lisp ports. 
The framework provides the following variables that can be set by ports: `ASDF_MODULES`:: List of `ASDF` modules to build when `FASL_TARGET` is set (defaults to `PORTNAME`) `FASL_TARGET`:: Build fasl variant of port (one of `ccl`, `clisp`, or `sbcl`) `USE_ASDF`:: Depend on package:devel/cl-asdf[] `USE_ASDF_FASL`:: Depend on `devel/cl-asdf-` `USE_CCL`:: Depend on package:lang/ccl[]; implied when `FASL_TARGET=ccl` `USE_CLISP`:: Depend on package:lang/clisp[]; implied when `FASL_TARGET=clisp` `USE_SBCL`:: Depend on package:lang/sbcl[]; implied when `FASL_TARGET=SBCL` The framework provides the following variables that can be read by ports: `ASDF_PATHNAME`:: Path to CL source `ASDF_REGISTRY`:: Path to CL registry containing asd files `CCL`:: Path to the Clozure Common Lisp compiler `CLISP`:: Path to the GNU Common Lisp compiler `CL_LIBDIR_REL`:: CL library directory relative to `LOCALBASE` or `PREFIX` `FASL_DIR_REL`:: Relative path to compiled fasl files; depends on `FASL_TARGET` `FASL_PATHNAME`:: Path to CL fasl `LISP_EXTRA_ARG`:: Extra arguments used when building fasl `SBCL`:: Path to the Steel Bank Common Lisp compiler [[uses-cmake]] == `cmake` Possible arguments: (none), `insource`, `noninja`, `run`, `testing` Use CMake for configuring the port and generating a build system. By default an out-of-source build is performed, leaving the sources in `WRKSRC` free from build artifacts. With the `insource` argument, an in-source build will be performed instead. This argument should be an exception, used only when a regular out-of-source build does not work. By default Ninja (package:devel/ninja[]) is used for the build. In some cases this does not work correctly. With the `noninja` argument, the build will use regular `make` for builds. This argument should only be used if a Ninja-based build does not work. With the `run` argument, a run dependency is registered in addition to a build dependency. With the `testing` argument, a test-target is added that uses CTest. 
When running tests the port will be re-configured for testing and re-built. For more information see crossref:special[using-cmake,Using `cmake`]. [[uses-compiler]] == `compiler` Possible arguments: (none), `env` (default, implicit), `{cpp}17-lang`, `{cpp}14-lang`, `{cpp}11-lang`, `gcc-{cpp}11-lib`, `{cpp}11-lib`, `{cpp}0x`, `c11`, `nestedfct`, `features` Determines which compiler to use based on any given wishes. Use `{cpp}17-lang` if the port needs a {cpp}17-capable compiler, `{cpp}14-lang` if the port needs a {cpp}14-capable compiler, `{cpp}11-lang` if the port needs a {cpp}11-capable compiler, `gcc-{cpp}11-lib` if the port needs the `g++` compiler with a {cpp}11 library, or `{cpp}11-lib` if the port needs a {cpp}11-ready standard library. If the port needs a compiler understanding {cpp}0X, C11 or nested functions, the corresponding parameters should be used. Use `features` to request a list of features supported by the default compiler. After including [.filename]#bsd.port.pre.mk# the port can inspect the results using these variables: * `COMPILER_TYPE`: the default compiler on the system, either gcc or clang * `ALT_COMPILER_TYPE`: the alternative compiler on the system, either gcc or clang. Only set if two compilers are present in the base system. * `COMPILER_VERSION`: the first two digits of the version of the default compiler. * `ALT_COMPILER_VERSION`: the first two digits of the version of the alternative compiler, if present. * `CHOSEN_COMPILER_TYPE`: the chosen compiler, either gcc or clang * `COMPILER_FEATURES`: the features supported by the default compiler. It currently lists the {cpp} library. [[uses-cpe]] == `cpe` Possible arguments: (none) Include Common Platform Enumeration (CPE) information in package manifest as a CPE 2.3 formatted string. See the https://scap.nist.gov/specifications/cpe/[CPE specification] for details. To add CPE information to a port, follow these steps: [.procedure] . 
Search for the official CPE entry for the software product either by using the NVD's https://web.nvd.nist.gov/view/cpe/search[CPE search engine] or in the https://nvd.nist.gov/feeds/xml/cpe/dictionary/official-cpe-dictionary_v2.3.xml.gz[official CPE dictionary] (warning, very large XML file).
_Do not ever make up CPE data._
. Add `cpe` to `USES` and compare the result of `make -V CPE_STR` to the CPE dictionary entry.
Continue one step at a time until `make -V CPE_STR` is correct.
. If the product name (second field, defaults to `PORTNAME`) is incorrect, define `CPE_PRODUCT`.
. If the vendor name (first field, defaults to `CPE_PRODUCT`) is incorrect, define `CPE_VENDOR`.
. If the version field (third field, defaults to `PORTVERSION`) is incorrect, define `CPE_VERSION`.
. If the update field (fourth field, defaults to empty) is incorrect, define `CPE_UPDATE`.
. If it is still not correct, check [.filename]#Mk/Uses/cpe.mk# for additional details, or contact the {ports-secteam}.
. Derive as much as possible of the CPE name from existing variables such as `PORTNAME` and `PORTVERSION`.
Use variable modifiers to extract the relevant portions from these variables rather than hardcoding the name.
. _Always_ run `make -V CPE_STR` and check the output before committing anything that changes `PORTNAME` or `PORTVERSION` or any other variable which is used to derive `CPE_STR`.

[[uses-cran]]
== `cran`

Possible arguments: (none), `auto-plist`, `compiles`

Uses the Comprehensive R Archive Network.
Specify `auto-plist` to automatically generate [.filename]#pkg-plist#.
Specify `compiles` if the port has code that needs to be compiled.

[[uses-desktop-file-utils]]
== `desktop-file-utils`

Possible arguments: (none)

Uses update-desktop-database from package:devel/desktop-file-utils[].
An extra post-install step will be run without interfering with any post-install steps already in the port [.filename]#Makefile#.
A line with crossref:plist[plist-keywords-desktop-file-utils,`@desktop-file-utils`] will be added to the plist. Only use this macro if the port provides a `.desktop` file which contains a `MimeType` entry. [[uses-desthack]] == `desthack` Possible arguments: (none) Changes the behavior of GNU configure to properly support `DESTDIR` in case the original software does not. [[uses-display]] == `display` Possible arguments: (none), _ARGS_ Set up a virtual display environment. If the environment variable `DISPLAY` is not set, then Xvfb is added as a build dependency, and `CONFIGURE_ENV` is extended with the port number of the currently running instance of Xvfb. The _ARGS_ parameter defaults to `install` and controls the phase around which to start and stop the virtual display. [[uses-dos2unix]] == `dos2unix` Possible arguments: (none) The port has files with line endings in DOS format which need to be converted. Several variables can be set to control which files will be converted. The default is to convert _all_ files, including binaries. See crossref:slow-porting[slow-patch-automatic-replacements,Simple Automatic Replacements] for examples. * `DOS2UNIX_REGEX`: match file names based on a regular expression. * `DOS2UNIX_FILES`: match literal file names. * `DOS2UNIX_GLOB`: match file names based on a glob pattern. * `DOS2UNIX_WRKSRC`: the directory from which to start the conversions. Defaults to `${WRKSRC}`. [[uses-drupal]] == `drupal` Possible arguments: `7`, `module`, `theme` Automate installation of a port that is a Drupal theme or module. Use with the version of Drupal that the port is expecting. For example, `USES=drupal:7,module` says that this port creates a Drupal 7 module. A Drupal 7 theme can be specified with `USES=drupal:7,theme`. [[uses-ebur128]] == `ebur128` Possible arguments: (none), `build`, `lib`, `run`, `test` Adds a dependency on package:audio/ebur128[]. 
It allows the port to transparently depend on the `rust` or `legacy` variants by using `DEFAULT_VERSIONS` in [.filename]#make.conf#.
For instance, to use the legacy version, use `DEFAULT_VERSIONS+=ebur128=legacy`.

When no arguments are used, the behavior is the same as if the `lib` argument was provided.
The rest of the arguments provide the corresponding category of dependency.

[[uses-eigen]]
== `eigen`

Possible arguments: `2`, `3`, `build` (default), `run`

Add dependency on package:math/eigen[].

[[uses-elextronfix]]
== `electronfix`

Possible arguments: `31`, `32`, `33`

Provide support for easy porting of Electron applications that are distributed in binary form.
Adds a build and run time dependency on package:devel/electron31[], package:devel/electron32[], or package:devel/electron33[] depending on the argument used.

The framework provides the following variables that can be set by ports:

`ELECTRONFIX_SYMLINK_FILES`::
List of files to be symlinked from the Electron distribution.

`ELECTRONFIX_MAIN_EXECUTABLE`::
File name of the main executable to be replaced with the original Electron binary.

[[uses-elfctl]]
== `elfctl`

Possible arguments: (none), `build` (default), `stage`

Set ELF binary feature control notes by setting `ELF_FEATURES`.
When either no argument or the `build` argument is supplied, binaries under `BUILD_WRKSRC` are operated on, and files listed in `ELF_FEATURES` are relative to `BUILD_WRKSRC`.
When the `stage` argument is supplied, binaries under `STAGEDIR` are operated on and files listed in `ELF_FEATURES` are relative to `STAGEDIR`.

[[uses-elfct-ex1]]
.Uses=elfctl
[example]
====
[.programlisting]
....
ELF_FEATURES=	featurelist:path/to/file1 \
		featurelist:path/to/file2
....
====

The format of `featurelist` is described in man:elfctl[1].

[[uses-elixir]]
== `elixir`

Possible arguments: (none)

Provide support for ports using package:lang/elixir[].
Adds a build and run time dependency on package:lang/elixir[].
Variables provided by the framework:

`ELIXIR_APP_NAME`::
Elixir app name as installed in Elixir's lib directory.

`ELIXIR_LIB_ROOT`::
Elixir default library path.

`ELIXIR_APP_ROOT`::
Root directory for this Elixir app.

`ELIXIR_HIDDEN`::
Applications to be hidden from the code path; usually `${PORTNAME}`.

`ELIXIR_LOCALE`::
A UTF-8 locale to be used by Elixir during builds (any UTF-8 locale is good).

`MIX_CMD`::
The `mix` command.

`MIX_COMPILE`::
The `mix` command used to compile an Elixir app.

`MIX_REWRITE`::
Automatically replace Mix dependencies with code paths.

`MIX_BUILD_DEPS`::
List of `BUILD_DEPENDS` in category/portname format (commonly referred to as "deps" in Erlang and Elixir).

`MIX_RUN_DEPS`::
List of `RUN_DEPENDS` in category/portname format.

`MIX_DOC_DIRS`::
Extra doc directories to be installed in `DOCSDIR`.

`MIX_DOC_FILES`::
Extra doc files to be installed in `DOCSDIR` (usually README.md).

`MIX_ENV`::
Environment for the Mix build (same format as `MAKE_ENV`).

`MIX_ENV_NAME`::
Name of the Mix build environment, usually "prod".

`MIX_BUILD_NAME`::
Name of the build output in _build/, usually `${MIX_ENV_NAME}`.

`MIX_TARGET`::
Name of the Mix target, usually "compile".

`MIX_EXTRA_APPS`::
List of sub-applications to be built, if any.

`MIX_EXTRA_DIRS`::
List of extra directories to be installed in `ELIXIR_APP_ROOT`.

`MIX_EXTRA_FILES`::
List of extra files to be installed in `ELIXIR_APP_ROOT`.

[[uses-emacs]]
== `emacs`

Possible arguments: (none) (default), `build`, `run`, `noflavors`

Provides support for ports requiring Emacs.
The `build` argument creates a build dependency on Emacs.
The `run` argument creates a run dependency on Emacs.
If both the `build` and `run` arguments are absent, create build and run dependencies on Emacs.
The `noflavors` argument prevents flavors, and is implied if there is no run dependency on Emacs.

The default Emacs flavor for ports with `USES=emacs` can be defined in [.filename]#make.conf#.
For example, for the `nox` flavor, use `DEFAULT_VERSIONS+= emacs=nox`. The valid flavors are: `full`, `canna`, `nox`, `wayland`, `devel_full`, `devel_nox`. Variables, which can be set by ports: `EMACS_FLAVORS_EXCLUDE`:: Do NOT build these Emacs flavors. If `EMACS_FLAVORS_EXCLUDE` is not defined and: * there is a run dependency on Emacs * the noflavors argument is not specified + then all valid Emacs flavors are assumed. `EMACS_NO_DEPENDS`:: Do NOT add build or run dependencies on Emacs. This will prevent flavors, and no byte code files will be generated as part of the package. Variables, which can be read by ports: `EMACS_CMD`:: Emacs command with full path (e.g. [.filename]#/usr/local/bin/emacs-30.1#) `EMACS_FLAVOR`:: Used for dependencies (e.g. `BUILD_DEPENDS= dash.el${EMACS_PKGNAMESUFFIX}>0:devel/dash@${EMACS_FLAVOR}`) `EMACS_LIBDIR`:: Emacs Library directory without `${PREFIX}` (e.g. [.filename]#share/emacs#) `EMACS_LIBDIR_WITH_VER`:: Library directory without `${PREFIX}` including version (e.g. [.filename]#share/emacs/30.1#) `EMACS_MAJOR_VER`:: Emacs major version (e.g. 30) `EMACS_PKGNAMESUFFIX`:: `PKGNAMESUFFIX` to distinguish Emacs flavors `EMACS_SITE_LISPDIR`:: Emacs site-lisp directory without `${PREFIX}` (e.g. [.filename]#share/emacs/site-lisp#) `EMACS_VER`:: Emacs version (e.g. 30.1) `EMACS_VERSION_SITE_LISPDIR`:: Include version (e.g. [.filename]#share/emacs/30.1/site-lisp#) [[uses-erlang]] == `erlang` Possible arguments: (none), `enc`, `rebar`, `rebar3` Adds a build and run time dependency on package:lang/erlang[]. Depending on the argument, it adds additional build dependencies. `enc` adds a dependency on package:devel/erlang-native-compiler[], `rebar` adds a dependency on package:devel/rebar[] and `rebar3` adds a dependency on package:devel/rebar3[]. 
In addition, the following variables are available to the port:

* `ERL_APP_NAME`: Erlang app name as installed in Erlang's lib dir (minus version)
* `ERL_APP_ROOT`: Root directory for this Erlang app
* `REBAR_CMD`: Path to the "rebar" command
* `REBAR3_CMD`: Path to the "rebar3" command
* `REBAR_PROFILE`: Rebar profile
* `REBAR_TARGETS`: Rebar target list (usually compile, maybe escriptize)
* `ERL_BUILD_NAME`: Build name for rebar3
* `ERL_BUILD_DEPS`: List of BUILD_DEPENDS in category/portname format
* `ERL_RUN_DEPS`: List of RUN_DEPENDS in category/portname format
* `ERL_DOCS`: List of documentation files and directories

[[uses-fakeroot]]
== `fakeroot`

Possible arguments: (none)

Changes some default behavior of build systems to allow installing as a user.
See https://wiki.debian.org/FakeRoot[] for more information on `fakeroot`.

[[uses-fam]]
== `fam`

Possible arguments: (none), `fam`, `gamin`

Uses a File Alteration Monitor as a library dependency, either package:devel/fam[] or package:devel/gamin[].
End users can set `WITH_FAM_SYSTEM` to specify their preference.

[[uses-firebird]]
== `firebird`

Possible arguments: (none), `25`

Add a dependency to the client library of the Firebird database.

[[uses-fonts]]
== `fonts`

Possible arguments: (none), `fc`, `fontsdir` (default), `none`

Adds a runtime dependency on tools needed to register fonts.
Depending on the argument, add a `crossref:plist[plist-keywords-fc,@fc] ${FONTSDIR}` line, a `crossref:plist[plist-keywords-fontsdir,@fontsdir] ${FONTSDIR}` line, or no line if the argument is `none`, to the plist.
`FONTSDIR` defaults to [.filename]#${PREFIX}/share/fonts/${FONTNAME}# and `FONTNAME` to `${PORTNAME}`.
Add `FONTSDIR` to `PLIST_SUB` and `SUB_LIST`.

[[uses-fortran]]
== `fortran`

Possible arguments: `gcc` (default)

Uses the GNU Fortran compiler.

[[uses-fpc]]
== `fpc`

Possible arguments: (none), `run`

Provide support for Free Pascal based ports.
It will install the Free Pascal compiler and units.
Adds a build dependency on package:lang/fpc[].
If the `run` argument is given, a run dependency is also added.

[[uses-fuse]]
== `fuse`

Possible arguments: `2` (default), `3`

The port will depend on the FUSE library and handle the dependency on the kernel module depending on the version of FreeBSD.

[[uses-gem]]
== `gem`

Possible arguments: (none), `noautoplist`

Handle building with RubyGems.
If `noautoplist` is used, the packing list is not generated automatically.
This implies `USES=ruby`.

[[uses-gettext]]
== `gettext`

Possible arguments: (none)

Deprecated.
Will include both crossref:uses[uses-gettext-runtime,`gettext-runtime`] and crossref:uses[uses-gettext-tools,`gettext-tools`].

[[uses-gettext-runtime]]
== `gettext-runtime`

Possible arguments: (none), `lib` (default), `build`, `run`

Uses package:devel/gettext-runtime[].
By default, with no arguments or with the `lib` argument, implies a library dependency on [.filename]#libintl.so#.
`build` and `run` imply, respectively, a build-time and a run-time dependency on [.filename]#gettext#.

[[uses-gettext-tools]]
== `gettext-tools`

Possible arguments: (none), `build` (default), `run`

Uses package:devel/gettext-tools[].
By default, with no argument, or with the `build` argument, a build time dependency on [.filename]#msgfmt# is registered.
With the `run` argument, a run-time dependency is registered.

[[uses-ghostscript]]
== `ghostscript`

Possible arguments: _X_, `build`, `run`, `nox11`

A specific version _X_ can be used.
Possible versions are `7`, `8`, `9`, and `agpl` (default).
`nox11` indicates that the `-nox11` version of the port is required.
`build` and `run` add build- and run-time dependencies on Ghostscript.
The default is both build- and run-time dependencies. [[uses-gl]] == `gl` Possible arguments: (none) Provides an easy way to depend on GL components. The components should be listed in `USE_GL`. The available components are: `egl`:: add a library dependency on [.filename]#libEGL.so# from package:graphics/libglvnd[] `gbm`:: Add a library dependency on [.filename]#libgbm.so# from package:graphics/mesa-libs[] `gl`:: Add a library dependency on [.filename]#libGL.so# from package:graphics/libglvnd[] `glesv2`:: Add a library dependency on [.filename]#libGLESv2.so# from package:graphics/libglvnd[] `glew`:: Add a library dependency on [.filename]#libGLEW.so# from package:graphics/glew[] `glu`:: Add a library dependency on [.filename]#libGLU.so# from package:graphics/libGLU[] `glut`:: Add a library dependency on [.filename]#libglut.so# from package:graphics/freeglut[] `opengl`:: Add a library dependency on [.filename]#libOpenGL.so# from package:graphics/libglvnd[] [[uses-gmake]] == `gmake` Possible arguments: (none) Uses package:devel/gmake[] as a build-time dependency and sets up the environment to use `gmake` as the default `make` for the build. [[uses-gnome]] == `gnome` Possible arguments: (none) Provides an easy way to depend on GNOME components. The components should be listed in `USE_GNOME`. 
The available components are: * `atk` * `atkmm` * `cairo` * `cairomm` * `dconf` * `esound` * `evolutiondataserver3` * `gconf2` * `gconfmm26` * `gdkpixbuf` * `gdkpixbuf2` * `glib12` * `glib20` * `glibmm` * `gnomecontrolcenter3` * `gnomedesktop3` * `gnomedesktop4` * `gnomedocutils` * `gnomemenus3` * `gnomemimedata` * `gnomeprefix` * `gnomesharp20` * `gnomevfs2` * `gsound` * `gtk-update-icon-cache` * `gtk12` * `gtk20` * `gtk30` * `gtkhtml3` * `gtkhtml4` * `gtkmm20` * `gtkmm24` * `gtkmm30` * `gtksharp20` * `gtksourceview` * `gtksourceview2` * `gtksourceview3` * `gtksourceviewmm3` * `gvfs` * `intlhack` * `intltool` * `introspection` * `libartlgpl2` * `libbonobo` * `libbonoboui` * `libgda5` * `libgda5-ui` * `libgdamm5` * `libglade2` * `libgnome` * `libgnomecanvas` * `libgnomekbd` * `libgnomeprint` * `libgnomeprintui` * `libgnomeui` * `libgsf` * `libgtkhtml` * `libgtksourceviewmm` * `libidl` * `librsvg2` * `libsigc++12` * `libsigc++20` * `libwnck` * `libwnck3` * `libxml++26` * `libxml2` * `libxslt` * `metacity` * `nautilus3` * `orbit2` * `pango` * `pangomm` * `pangox-compat` * `py3gobject3` * `pygnome2` * `pygobject` * `pygobject3` * `pygtk2` * `pygtksourceview` * `referencehack` * `vte` * `vte3` The default dependency is build- and run-time, it can be changed with `:build` or `:run`. For example: [.programlisting] .... USES= gnome USE_GNOME= gnomemenus3:build intlhack .... See crossref:special[using-gnome,Using GNOME] for more information. [[uses-go]] == `go` [IMPORTANT] ==== Ports should not be created for Go libs, see crossref:special[go-libs,Go Libraries] for more information. ==== Possible arguments: (none), `N.NN`, `N.NN-devel`, `modules`, `no_targets`, `run` Sets default values and targets used to build Go software. A build dependency on the Go compiler port is added, port maintainers can set version required. By default the build is performed in GOPATH mode. If Go software uses modules, the modules-aware mode can be switched on with `modules` argument. 
`no_targets` will set up the build environment (variables such as `GO_ENV` and `GO_BUILDFLAGS`) but skip creating the extract and build targets. `run` will also add a run dependency on the Go compiler port.

The build process is controlled by several variables:

`GO_MODULE`::
The name of the application module as specified by the `module` directive in `go.mod`. In most cases, this is the only required variable for ports that use Go modules.

`GO_PKGNAME`::
The name of the Go package when building in GOPATH mode. This is the directory that will be created in `${GOPATH}/src`. If not set explicitly and `GH_SUBDIR` or `GL_SUBDIR` is present, `GO_PKGNAME` will be inferred from it. It is not needed when building in modules-aware mode.

`GO_TARGET`::
The packages to build. The default value is `${GO_PKGNAME}`. `GO_TARGET` can also be a tuple in the form `package:path` where path can be either a simple filename or a full path starting with `${PREFIX}`.

`GO_TESTTARGET`::
The packages to test. The default value is `./...` (the current package and all subpackages).

`CGO_CFLAGS`::
Additional `CFLAGS` values to be passed to the C compiler by `go`.

`CGO_LDFLAGS`::
Additional `LDFLAGS` values to be passed to the linker by `go`.

`GO_BUILDFLAGS`::
Additional build arguments to be passed to `go build`.

`GO_TESTFLAGS`::
Additional test arguments to be passed to `go test`.

See crossref:special[using-go,Building Go Applications] for usage examples.

[[uses-gperf]]
== `gperf`

Possible arguments: (none)

Add a build-time dependency on package:devel/gperf[] if `gperf` is not present in the base system.

[[uses-grantlee]]
== `grantlee`

Possible arguments: `5`, `selfbuild`

Handle dependency on Grantlee. Specify `5` to depend on the Qt5 based version, package:devel/grantlee5[]. `selfbuild` is used internally by package:devel/grantlee5[] to get its version number.

[[uses-groff]]
== `groff`

Possible arguments: `build`, `run`, `both`

Registers a dependency on package:textproc/groff[] if not present in the base system.
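As a minimal sketch of the arguments above, a hypothetical port whose scripts format manual pages with `groff` at run time (the choice of argument is illustrative) could declare:

[.programlisting]
....
USES=	groff:run
....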
[[uses-gssapi]]
== `gssapi`

Possible arguments: (none), `base` (default), `heimdal`, `mit`, `flags`, `bootstrap`

Handle dependencies needed by consumers of the GSS-API. Only libraries that provide the Kerberos mechanism are available. By default, or set to `base`, the GSS-API library from the base system is used. Can also be set to `heimdal` to use package:security/heimdal[], or `mit` to use package:security/krb5[].

When the local Kerberos installation is not in `LOCALBASE`, set `HEIMDAL_HOME` (for `heimdal`) or `KRB5_HOME` (for `krb5`) to the location of the Kerberos installation.

These variables are exported for the ports to use:

* `GSSAPIBASEDIR`
* `GSSAPICPPFLAGS`
* `GSSAPIINCDIR`
* `GSSAPILDFLAGS`
* `GSSAPILIBDIR`
* `GSSAPILIBS`
* `GSSAPI_CONFIGURE_ARGS`

The `flags` option can be given alongside `base`, `heimdal`, or `mit` to automatically add `GSSAPICPPFLAGS`, `GSSAPILDFLAGS`, and `GSSAPILIBS` to `CFLAGS`, `LDFLAGS`, and `LDADD`, respectively. For example, use `base,flags`.

The `bootstrap` option is a special prefix only for use by package:security/krb5[] and package:security/heimdal[]. For example, use `bootstrap,mit`.

[[uses-gssapi-ex1]]
.Typical Use
[example]
====
[.programlisting]
....
OPTIONS_SINGLE=	GSSAPI
OPTIONS_SINGLE_GSSAPI=	GSSAPI_BASE GSSAPI_HEIMDAL GSSAPI_MIT GSSAPI_NONE
GSSAPI_BASE_USES=	gssapi
GSSAPI_BASE_CONFIGURE_ON=	--with-gssapi=${GSSAPIBASEDIR} ${GSSAPI_CONFIGURE_ARGS}
GSSAPI_HEIMDAL_USES=	gssapi:heimdal
GSSAPI_HEIMDAL_CONFIGURE_ON=	--with-gssapi=${GSSAPIBASEDIR} ${GSSAPI_CONFIGURE_ARGS}
GSSAPI_MIT_USES=	gssapi:mit
GSSAPI_MIT_CONFIGURE_ON=	--with-gssapi=${GSSAPIBASEDIR} ${GSSAPI_CONFIGURE_ARGS}
GSSAPI_NONE_CONFIGURE_ON=	--without-gssapi
....
====

[[uses-gstreamer]]
== `gstreamer`

Possible arguments: (none)

Provides an easy way to depend on GStreamer components. The components should be listed in `USE_GSTREAMER`.
The available components are:

* `a52dec`
* `aalib`
* `amrnb`
* `amrwbdec`
* `aom`
* `assrender`
* `bad`
* `bs2b`
* `cairo`
* `cdio`
* `cdparanoia`
* `chromaprint`
* `curl`
* `dash`
* `dtls`
* `dts`
* `dv`
* `dvd`
* `dvdread`
* `editing-services`
* `faac`
* `faad`
* `flac`
* `flite`
* `gdkpixbuf`
* `gl`
* `gme`
* `gnonlin`
* `good`
* `gsm`
* `gtk4`
* `gtk`
* `hal`
* `hls`
* `jack`
* `jpeg`
* `kate`
* `kms`
* `ladspa`
* `lame`
* `libav`
* `libcaca`
* `libde265`
* `libmms`
* `libvisual`
* `lv2`
* `mm`
* `modplug`
* `mpeg2dec`
* `mpeg2enc`
* `mpg123`
* `mplex`
* `musepack`
* `neon`
* `ogg`
* `opencv`
* `openexr`
* `openh264`
* `openjpeg`
* `openmpt`
* `opus`
* `pango`
* `png`
* `pulse`
* `qt`
* `resindvd`
* `rsvg`
* `rtmp`
* `shout2`
* `sidplay`
* `smoothstreaming`
* `sndfile`
* `sndio`
* `soundtouch`
* `soup`
* `spandsp`
* `speex`
* `srtp`
* `taglib`
* `theora`
* `ttml`
* `twolame`
* `ugly`
* `v4l2`
* `vorbis`
* `vpx`
* `vulkan`
* `wavpack`
* `webp`
* `webrtcdsp`
* `x264`
* `x265`
* `x`
* `ximagesrc`
* `zbar`

[[uses-guile]]
== `guile`

Possible arguments: (none), `_X.Y_`, `flavors`, `build`, `run`, `alias`, `conflicts`

Adds a dependency on Guile. By default this is a library dependency on the appropriate `libguile*.so`, unless overridden by the `build` and/or `run` option. The `alias` option configures `BINARY_ALIAS` appropriately (see crossref:makefiles[binary-alias,Use `BINARY_ALIAS`]).

The default version is set by the usual `DEFAULT_VERSIONS` mechanism; if the default version is not one of the listed versions, then the latest available listed version is used.

Applications using Guile are normally built for only a single Guile version. However, extension or library modules should use the `flavors` option to build with multiple flavors.

For more information see crossref:special[using-guile,Using Guile].

[[uses-horde]]
== `horde`

Possible arguments: (none)

Add build-time and run-time dependencies on package:devel/pear-channel-horde[].
Other Horde dependencies can be added with `USE_HORDE_BUILD` and `USE_HORDE_RUN`. See crossref:special[php-horde,Horde Modules] for more information.

[[uses-iconv]]
== `iconv`

Possible arguments: (none), `lib`, `build`, `patch`, `translit`, `wchar_t`

Uses `iconv` functions, either from the port package:converters/libiconv[] as a build-time and run-time dependency, or from the base system. By default, with no arguments or with the `lib` argument, implies `iconv` with build-time and run-time dependencies. `build` implies a build-time dependency, and `patch` implies a patch-time dependency. If the port uses the `WCHAR_T` or `//TRANSLIT` iconv extensions, add the relevant arguments so that the correct iconv is used. For more information see crossref:special[using-iconv,Using `iconv`].

[[uses-imake]]
== `imake`

Possible arguments: (none), `env`, `notall`, `noman`

Add package:devel/imake[] as a build-time dependency and run `xmkmf -a` during the `configure` stage. If the `env` argument is given, the `configure` target is not set. If the `-a` flag is a problem for the port, add the `notall` argument. If `xmkmf` does not generate an `install.man` target, add the `noman` argument.

[[uses-java]]
== `java`

Possible arguments: (none), `ant`, `build`, `extract`, `run`

Defaults to `USES=java:build,run` if no arguments are provided and `NO_BUILD` is undefined. If `NO_BUILD` is defined, `USES=java:run` is used. If the `ant` argument is given, the port uses Apache Ant. If the `build` argument is given, a JDK port is added to the build dependencies. If the `extract` argument is given, a JDK port is added to the extract dependencies. If the `run` argument is given, a JDK port is added to the run dependencies.

The framework provides the following variables to be set by the port:

`JAVA_VERSION`::
List of space-separated suitable Java versions for the port. An optional `\+` allows specifying a range of versions.
(allowed values: `8[+]`, `11[+]`, `17[+]`, `18[+]`, `19[+]`, `20[+]`, `21[+]`, `22[+]`)

`JAVA_OS`::
List of space-separated suitable JDK port operating systems for the port. (allowed values: `native`, `linux`)

`JAVA_VENDOR`::
List of space-separated suitable JDK port vendors for the port. (allowed values: `openjdk`, `oracle`)

The framework exposes the following variables to be read by the port:

`JAVA_PORT`::
The name of the JDK port. (e.g. 'java/openjdk8')

`JAVA_PORT_VERSION`::
The version of the JDK port. (e.g. '8')

`JAVA_PORT_OS`::
The operating system used by the JDK port. (e.g. 'linux')

`JAVA_PORT_VENDOR`::
The vendor of the JDK port. (e.g. 'openjdk')

`JAVA_PORT_OS_DESCRIPTION`::
Description of the operating system used by the JDK port. (e.g. 'Linux')

`JAVA_PORT_VENDOR_DESCRIPTION`::
Description of the vendor of the JDK port. (e.g. 'OpenJDK BSD Porting Team')

`JAVA_HOME`::
Path to the installation directory of the JDK. (e.g. [.filename]#/usr/local/openjdk8#)

`JAVAC`::
Path to the Java compiler to use. (e.g. [.filename]#/usr/local/openjdk8/bin/javac# or [.filename]#/usr/local/bin/javac#)

`JAR`::
Path to the JAR tool to use. (e.g. [.filename]#/usr/local/openjdk8/bin/jar# or [.filename]#/usr/local/bin/fastjar#)

`APPLETVIEWER`::
Path to the appletviewer utility. (e.g. [.filename]#/usr/local/linux-jdk1.8.0/bin/appletviewer#)

`JAVA`::
Path to the `java` executable. Use this for executing Java programs. (e.g. [.filename]#/usr/local/openjdk8/bin/java#)

`JAVADOC`::
Path to the `javadoc` utility program.

`JAVAH`::
Path to the `javah` program.

`JAVAP`::
Path to the `javap` program.

`JAVA_KEYTOOL`::
Path to the `keytool` utility program.

`JAVA_N2A`::
Path to the `native2ascii` tool.

`JAVA_POLICYTOOL`::
Path to the `policytool` program.

`JAVA_SERIALVER`::
Path to the `serialver` utility program.

`RMIC`::
Path to the RMI stub/skeleton generator, `rmic`.

`RMIREGISTRY`::
Path to the RMI registry program, `rmiregistry`.

`RMID`::
Path to the RMI daemon program.
`JAVA_CLASSES`::
Path to the archive that contains the JDK class files. On most JDKs, this is [.filename]#${JAVA_HOME}/jre/lib/rt.jar#.

`JAVASHAREDIR`::
The base directory for all shared Java resources.

`JAVAJARDIR`::
The directory where a port should install JAR files.

`JAVALIBDIR`::
The directory where JAR files installed by other ports are located.

[[uses-jpeg]]
== `jpeg`

Possible arguments: `lib` (default, implicit), `build`, `run`, `both`

Helps handle dependencies on JPEG. If the `lib` argument is provided or no arguments are provided then a library dependency is added to the port. If the `build` argument is provided then a build dependency is added to the port. If the `run` argument is provided then a run dependency is added to the port. If the `both` argument is provided then a build dependency and a run dependency are added to the port.

The framework provides the following variable that can be set by ports:

`JPEG_PORT`::
Specifies the JPEG implementation to use. Possible values are:

* package:graphics/jpeg-turbo[] (default)
* package:graphics/mozjpeg[]

[[uses-kde]]
== `kde`

Possible arguments: `5`

Add dependency on KDE components. See crossref:special[using-kde,Using KDE] for more information.

[[uses-kmod]]
== `kmod`

Possible arguments: (none), `debug`

Fills in the boilerplate for kernel module ports, currently:

* Add `kld` to `CATEGORIES`.
* Set `SSP_UNSAFE`.
* Set `IGNORE` if the kernel sources are not found in `SRC_BASE`.
* Define `KMODDIR` to [.filename]#/boot/modules# by default, add it to `PLIST_SUB` and `MAKE_ENV`, and create it upon installation. If `KMODDIR` is set to [.filename]#/boot/kernel#, it will be rewritten to [.filename]#/boot/modules#. This prevents breaking packages when upgrading the kernel due to [.filename]#/boot/kernel# being renamed to [.filename]#/boot/kernel.old# in the process.
* Handle cross-referencing kernel modules upon installation and deinstallation, using crossref:plist[plist-keywords-kld,`@kld`].
* If the `debug` argument is given, the port can install a debug version of the module into [.filename]#KERN_DEBUGDIR#/[.filename]#KMODDIR#. By default, `KERN_DEBUGDIR` is copied from `DEBUGDIR` and set to [.filename]#/usr/lib/debug#. The framework will take care of creating and removing any required directories.

[[uses-kodi]]
== `kodi`

Possible arguments: (none), `noautoplist`

Provide support for package:multimedia/kodi[] add-ons. If the `noautoplist` argument is provided, the `plist` is not generated automatically.

[[uses-lazarus]]
== `lazarus`

Possible arguments: (none), `gtk2` (default), `qt5`, `qt6`, `flavors`

Provide support for package:editors/lazarus[] based ports. If no arguments are provided, or if the `gtk2` argument is provided, the Lazarus application and the package:editors/lazarus[] port are built with the `gtk2` interface. If the `qt5` argument is provided, the Lazarus application is built with a `qt5` interface. If the `qt6` argument is provided, the Lazarus application is built with a `qt6` interface. If the `flavors` argument is provided, the Lazarus application is built with the flavors feature.

If the port does not require compiling Lazarus project files automatically, the following variable can be defined: `NO_LAZBUILD=yes`

The following variables are available for ports:

`LAZARUS_PROJECT_FILES`::
List of [.filename]#.lpi# project files. It must not be left empty. Default: empty

`LAZARUS_DIR`::
Path to the Lazarus installation directory. Default: [.filename]#${LOCALBASE}/share/lazarus-${LAZARUS_VER}#

`LAZBUILD_ARGS`::
Extra arguments passed to `lazbuild`; `-d` is appropriate in most cases. See man:lazbuild[1] for more information. Default: empty

`LAZARUS_NO_FLAVORS`::
Do not build these Lazarus flavors. If `LAZARUS_NO_FLAVORS` is not defined then all valid Lazarus flavors are assumed.

`WANT_LAZARUS_DEVEL`::
If set to `yes` then use package:editors/lazarus-devel[] as a build dependency.

[[uses-ldap]]
== `ldap`

Possible arguments: (none), `client`, `server`

Registers a dependency on package:net/openldap[].
It uses the specific `` (without the dot notation) if set. Otherwise it tries to find the currently installed version. If necessary it falls back to the default version found in `bsd.default-versions.mk`.

`client` specifies a runtime dependency on the client library. This is also the default. `server` specifies a runtime dependency on the server.

The following variables can be accessed by the port:

`IGNORE_WITH_OPENLDAP`::
This variable can be defined if the port does not support one or more versions of OpenLDAP.

`WITH_OPENLDAP_VER`::
User defined variable to set the OpenLDAP version.

`OPENLDAP_VER`::
Detected OpenLDAP version.

[[uses-lha]]
== `lha`

Possible arguments: (none)

Set `EXTRACT_SUFX` to `.lzh`.

[[uses-libarchive]]
== `libarchive`

Possible arguments: (none)

Registers a dependency on package:archivers/libarchive[]. Any ports depending on libarchive must include `USES=libarchive`.

[[uses-libedit]]
== `libedit`

Possible arguments: (none)

Registers a dependency on package:devel/libedit[]. Any ports depending on libedit must include `USES=libedit`.

[[uses-libtool]]
== `libtool`

Possible arguments: (none), `keepla`, `build`

Patches `libtool` scripts. This must be added to all ports that use `libtool`. The `keepla` argument can be used to keep [.filename]#.la# files. Some ports do not ship with their own copy of libtool and need a build-time dependency on package:devel/libtool[]; use the `build` argument to add such a dependency.

[[uses-linux]]
== `linux`

Possible arguments: `c6`, `c7`

Provides support for the Linux compatibility framework. Specify `c6` to depend on CentOS 6 packages. Specify `c7` to depend on CentOS 7 packages.
The available packages are:

* `allegro`
* `alsa-plugins-oss`
* `alsa-plugins-pulseaudio`
* `alsalib`
* `atk`
* `avahi-libs`
* `base`
* `cairo`
* `cups-libs`
* `curl`
* `cyrus-sasl2`
* `dbusglib`
* `dbuslibs`
* `devtools`
* `dri`
* `expat`
* `flac`
* `fontconfig`
* `gdkpixbuf2`
* `gnutls`
* `graphite2`
* `gtk2`
* `harfbuzz`
* `jasper`
* `jbigkit`
* `jpeg`
* `libasyncns`
* `libaudiofile`
* `libelf`
* `libgcrypt`
* `libgfortran`
* `libgpg-error`
* `libmng`
* `libogg`
* `libpciaccess`
* `libsndfile`
* `libsoup`
* `libssh2`
* `libtasn1`
* `libthai`
* `libtheora`
* `libv4l`
* `libvorbis`
* `libxml2`
* `mikmod`
* `naslibs`
* `ncurses-base`
* `nspr`
* `nss`
* `openal`
* `openal-soft`
* `openldap`
* `openmotif`
* `openssl`
* `pango`
* `pixman`
* `png`
* `pulseaudio-libs`
* `qt`
* `qt-x11`
* `qtwebkit`
* `scimlibs`
* `sdl12`
* `sdlimage`
* `sdlmixer`
* `sqlite3`
* `tcl85`
* `tcp_wrappers-libs`
* `tiff`
* `tk85`
* `ucl`
* `xorglibs`

[[uses-llvm]]
== `llvm`

Possible arguments: (none), `_XY_`, `min=_XY_`, `max=_XY_`, `build`, `run`, `lib`

Adds a dependency on LLVM. By default this is a build dependency unless overridden by the `run` or `lib` options. The default version is the one set in `LLVM_DEFAULT`. A specific version can be specified as well. The minimum and maximum versions can be specified with the `min` and `max` parameters respectively.

The ports framework exports the following variables to the port:

`LLVM_VERSION`::
Version chosen from the arguments to [.filename]#llvm.mk#

`LLVM_PORT`::
Chosen llvm port

`LLVM_CONFIG`::
`llvm-config` of the chosen port

`LLVM_LIBLLVM`::
[.filename]#libLLVM.so# of the chosen port

`LLVM_PREFIX`::
Installation prefix of the chosen port

[[uses-localbase]]
== `localbase`

Possible arguments: (none), `ldflags`

Ensures that libraries from dependencies in `LOCALBASE` are used instead of the ones from the base system. Specify `ldflags` to add `-L${LOCALBASE}/lib` to `LDFLAGS` instead of `LIBS`.
Ports that depend on libraries that are also present in the base system should use this. It is also used internally by a few other `USES`.

[[uses-lua]]
== `lua`

Possible arguments: (none), `_XY_`, `_XY_+`, `-_XY_`, `_XY_-_ZA_`, `module`, `flavors`, `build`, `run`, `env`

Adds a dependency on Lua. By default this is a library dependency, unless overridden by the `build` and/or `run` option. The `env` option prevents the addition of any dependency, while still defining all the usual variables.

The default version is set by the usual `DEFAULT_VERSIONS` mechanism, unless a version or range of versions is specified as an argument, for example, `51` or `51-54`.

Applications using Lua are normally built for only a single Lua version. However, library modules intended to be loaded by Lua code should use the `module` option to build with multiple flavors.

For more information see crossref:special[using-lua,Using Lua].

[[uses-luajit]]
== `luajit`

Possible arguments: (none), `_X_`

Adds a dependency on the LuaJIT runtime. A specific version _X_ can be used. Possible versions are `luajit`, `luajit-devel`, and `luajit-openresty`.

After including [.filename]#bsd.port.options.mk# or [.filename]#bsd.port.pre.mk# the port can inspect these variables:

`LUAJIT_VER`::
The selected LuaJIT version

`LUAJIT_INCDIR`::
The path to LuaJIT's header files

`LUAJIT_LUAVER`::
Which LuaJIT spec version is selected (2.0 for `luajit`, else 2.1)

For more information see crossref:special[using-lua,Using Lua].

[[uses-lxqt]]
== `lxqt`

Possible arguments: (none)

Handle dependencies for the LXQt Desktop Environment. Use `USE_LXQT` to select the components needed for the port. See crossref:special[using-lxqt,Using LXQt] for more information.

[[uses-magick]]
== `magick`

Possible arguments: (none), `_X_`, `build`, `nox11`, `run`, `test`

Add a library dependency on `ImageMagick`. A specific version _X_ can be used. Possible versions are `6` and `7` (default).
`nox11` indicates that the `-nox11` version of the port is required. `build`, `run` and `test` add build-, run-time and test dependencies on ImageMagick.

[[uses-makeinfo]]
== `makeinfo`

Possible arguments: (none)

Add a build-time dependency on `makeinfo` if it is not present in the base system.

[[uses-makeself]]
== `makeself`

Possible arguments: (none)

Indicates that the distribution files are makeself archives and sets the appropriate dependencies.

[[uses-mate]]
== `mate`

Possible arguments: (none)

Provides an easy way to depend on MATE components. The components should be listed in `USE_MATE`. The available components are:

* `autogen`
* `caja`
* `common`
* `controlcenter`
* `desktop`
* `dialogs`
* `docutils`
* `icontheme`
* `intlhack`
* `intltool`
* `libmatekbd`
* `libmateweather`
* `marco`
* `menus`
* `notificationdaemon`
* `panel`
* `pluma`
* `polkit`
* `session`
* `settingsdaemon`

The default dependency is build- and run-time; it can be changed with `:build` or `:run`. For example:

[.programlisting]
....
USES=	mate
USE_MATE=	menus:build intlhack
....

[[uses-meson]]
== `meson`

Possible arguments: (none)

Provide support for Meson based projects. For more information see crossref:special[using-meson,Using `meson`].

[[uses-metaport]]
== `metaport`

Possible arguments: (none)

Sets the following variables to make it easier to create a metaport: `MASTER_SITES`, `DISTFILES`, `EXTRACT_ONLY`, `NO_BUILD`, `NO_INSTALL`, `NO_MTREE`, `NO_ARCH`.

[[uses-minizip]]
== `minizip`

Possible arguments: (none), `ng`

Adds a library dependency on package:archivers/minizip[] or, with the `ng` argument, on package:archivers/minizip-ng[].

[[uses-mlt]]
== `mlt`

Possible arguments: `7`, `nodepend`

Provide support for ports depending on package:multimedia/mlt7[]. If the `nodepend` argument is provided no library dependency is generated. This argument only makes sense for multimedia/mlt7* ports.
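As a minimal sketch of the `mlt` knob above, a hypothetical port linking against MLT 7 would declare:

[.programlisting]
....
USES=	mlt:7
....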
[[uses-mysql]]
== `mysql`

Possible arguments: (none), `_version_`, `client` (default), `server`, `embedded`

Provide support for MySQL. If no version is given, try to find the currently installed version. Fall back to the default version, MySQL-5.6. The possible versions are `55`, `55m`, `55p`, `56`, `56p`, `56w`, `57`, `57p`, `80`, `100m`, `101m`, and `102m`. The `m` and `p` suffixes are for the MariaDB and Percona variants of MySQL.

`server` and `embedded` add a build- and run-time dependency on the MySQL server. When using `server` or `embedded`, add `client` to also add a dependency on [.filename]#libmysqlclient.so#. A port can set `IGNORE_WITH_MYSQL` if some versions are not supported.

The framework sets `MYSQL_VER` to the detected MySQL version.

[[uses-mono]]
== `mono`

Possible arguments: (none), `nuget`

Adds a dependency on the Mono (currently only C#) framework by setting the appropriate dependencies.

Specify `nuget` when the port uses nuget packages. `NUGET_DEPENDS` needs to be set with the names and versions of the nuget packages in the format `_name_=_version_`. An optional package origin can be added using `_name_=_version_:_origin_`.

The helper target, `buildnuget`, will output the content of `NUGET_DEPENDS` based on the provided [.filename]#packages.config#.

[[uses-motif]]
== `motif`

Possible arguments: (none)

Uses package:x11-toolkits/open-motif[] as a library dependency. End users can set `WANT_LESSTIF` in [.filename]#make.conf# to use package:x11-toolkits/lesstif[] as a dependency instead of package:x11-toolkits/open-motif[]. Similarly, setting `WANT_OPEN_MOTIF_DEVEL` in [.filename]#make.conf# will add a dependency on package:x11-toolkits/open-motif-devel[].

[[uses-mpi]]
== `mpi`

Possible arguments: `mpich` (default), `openmpi`

Provide support for ports depending on `MPI`. If the `mpich` argument is provided a dependency on package:net/mpich[] is added to the port.
If the `openmpi` argument is provided a dependency on package:net/openmpi[] is added to the port.

The ports framework provides the following variables that can be read by the port:

`MPI_LIBS`::
Libraries needed to link programs using `MPI`.

`MPI_CFLAGS`::
Compiler flags necessary to build programs using `MPI`.

`MPICC`::
Location of the `mpicc` executable. Default: [.filename]#${MPI_HOME}/bin/mpicc#.

`MPICXX`::
Location of the `mpicxx` executable. Default: [.filename]#${MPI_HOME}/bin/mpicxx#.

`MPIF90`::
Location of the `mpif90` executable. Default: [.filename]#${MPI_HOME}/bin/mpif90#.

`MPIFC`::
Same as above.

`MPI_HOME`::
Installation directory of `MPI`. Defaults to `${LOCALBASE}` for `MPICH`.

`MPIEXEC`::
Location of the `mpiexec` executable. Default: [.filename]#${MPI_HOME}/bin/mpiexec#.

`MPIRUN`::
Location of the `mpirun` executable. Default: [.filename]#${MPI_HOME}/bin/mpirun#.

[[uses-ncurses]]
== `ncurses`

Possible arguments: (none), `base`, `port`

Uses ncurses, and causes some useful variables to be set.

[[uses-nextcloud]]
== `nextcloud`

Possible arguments: (none)

Adds support for Nextcloud applications by adding a run-time dependency on package:www/nextcloud[].

[[uses-ninja]]
== `ninja`

Possible arguments: (none), `build`, `make` (default), `run`

If the `build` or `run` argument is specified, it adds a build- or run-time dependency on package:devel/ninja[], respectively. If `make` or no arguments are provided, use ninja to build the port instead of make. `make` implies `build`. If the variable `NINJA_DEFAULT` is set to `samurai`, then the dependencies are set on package:devel/samurai[] instead.

[[uses-nodejs]]
== `nodejs`

Possible arguments: (none), `build`, `run`, `current`, `lts`, `10`, `14`, `16`, `17`

Uses nodejs. Adds a dependency on package:www/node*[]. If a supported version is specified then `run` and/or `build` must be specified too.
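Putting the `nodejs` arguments together, a hypothetical application that only needs the LTS version of Node.js at run time could declare:

[.programlisting]
....
USES=	nodejs:run,lts
....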
[[uses-objc]]
== `objc`

Possible arguments: (none)

Add Objective-C dependencies (compiler, runtime library) if the base system does not support it.

[[uses-ocaml]]
== `ocaml`

Possible arguments: (none), `build`, `camlp4`, `dune`, `findlib`, `findplist`, `ldconfig`, `run`, `tk`, `tkbuild`, `tkrun`, `wash`

Provide support for OCaml. If no arguments are provided, it defaults to `build`, `run`.

If the `build` argument is provided then package:lang/ocamlc[] is added to `BUILD_DEPENDS`, `EXTRACT` and `PATCH_DEPENDS`. If the `camlp4` argument is provided then package:devel/ocamlp4[] is used to build. If the `dune` argument is provided then package:devel/ocaml-dune[] is used as the build system. If the `findlib` argument is provided then `ocamlfind` will be used to install packages. Package directories will be automatically deleted. If the `findplist` argument is provided then the contents of the `findlib` target directories will be added automatically. If the `ldconfig` argument is provided then OCaml's [.filename]#ld.conf# file will be automatically processed. When `dune` is used, Dune may install stublibs in the site-lib package directory (or directories) or in a single directory below the `DUNE_LIBDIR` site-lib directory; set this when the port installs shared libraries into OCaml's tree. If the `run` argument is provided, `ocamlc` is added to `RUN_DEPENDS`. If the `tk` argument is provided then a build and run dependency on package:x11-toolkits/ocaml-labltk[] is added to the port. Implies `tkbuild` and `tkrun`. If the `tkbuild` argument is provided then package:x11-toolkits/ocaml-labltk[] is added to `BUILD_DEPENDS`, `EXTRACT` and `PATCH_DEPENDS`. If the `tkrun` argument is provided then package:x11-toolkits/ocaml-labltk[] is added to `RUN_DEPENDS`. If the `wash` argument is provided, OCaml's shared directories will be purged on uninstall. Useful when installing to a non-standard `PREFIX`.
The following variables can be set by the port:

`OCAML_PKGDIRS`::
Directories under site-lib to be processed if the `findlib` argument is specified. Default: `${PORTNAME}`

`OCAML_LDLIBS`::
Directories under `PREFIX` to be automatically added to/removed from [.filename]#ld.conf#. Default: `${OCAML_SITELIBDIR}/${PORTNAME}`

`OCAML_PACKAGES`::
List of packages to build and install. Defaults to `${PORTNAME}`

[[uses-octave]]
== `octave`

Possible arguments: (none), `env`

Uses package:math/octave[]. The `env` argument only sets the `OCTAVE_VERSION` environment variable.

[[uses-openal]]
== `openal`

Possible arguments: `al`, `soft` (default), `si`, `alut`

Uses OpenAL. The backend can be specified, with the software implementation as the default. The user can specify a preferred backend with `WANT_OPENAL`. Valid values for this knob are `soft` (default) and `si`.

[[uses-pathfix]]
== `pathfix`

Possible arguments: (none)

Look for [.filename]#Makefile.in# and [.filename]#configure# in `PATHFIX_WRKSRC` (defaults to `WRKSRC`) and fix common paths to make sure they respect the FreeBSD hierarchy. For example, it fixes the installation directory of `pkgconfig`'s [.filename]#.pc# files to [.filename]#${PREFIX}/libdata/pkgconfig#. If the port uses `USES=autoreconf`, [.filename]#Makefile.am# will be added to `PATHFIX_MAKEFILEIN` automatically.

If the port crossref:uses[uses-cmake,`USES=cmake`] it will look for [.filename]#CMakeLists.txt# in `PATHFIX_WRKSRC`. If needed, that default filename can be changed with `PATHFIX_CMAKELISTSTXT`.

[[uses-pear]]
== `pear`

Possible arguments: `env`

Adds a dependency on package:devel/pear[]. It will set up the default behavior for software using the PHP Extension and Application Repository. Using the `env` argument only sets up the PEAR environment variables. See crossref:special[php-pear,PEAR Modules] for more information.

[[uses-perl5]]
== `perl5`

Possible arguments: (none)

Depends on Perl. The configuration is done using `USE_PERL5`.
`USE_PERL5` can contain the phases in which to use Perl; these can be `extract`, `patch`, `build`, `run`, or `test`.

`USE_PERL5` can also contain `configure`, `modbuild`, or `modbuildtiny` when [.filename]#Makefile.PL#, [.filename]#Build.PL#, or Module::Build::Tiny's flavor of [.filename]#Build.PL# is required.

`USE_PERL5` defaults to `build run`. When using `configure`, `modbuild`, or `modbuildtiny`, `build` and `run` are implied.

See crossref:special[using-perl,Using Perl] for more information.

[[uses-pgsql]]
== `pgsql`

Possible arguments: (none), `_X.Y_`, `_X.Y_+`, `_X.Y_-`, `_X.Y_-_Z.A_`

Provide support for PostgreSQL. Port maintainer can set the required version. Minimum and maximum versions or a range can be specified; for example, `9.0-`, `8.4+`, `8.4-9.2`.

By default, the added dependency will be the client, but if the port requires additional components, this can be done using `WANT_PGSQL=_component[:target]_`; for example, `WANT_PGSQL=server:configure pltcl plperl`. The available components are:

* `client`
* `contrib`
* `docs`
* `pgtcl`
* `plperl`
* `plpython`
* `pltcl`
* `server`

[[uses-php]]
== `php`

Possible arguments: (none), `phpize`, `ext`, `zend`, `build`, `cli`, `cgi`, `mod`, `web`, `embed`, `pecl`, `flavors`, `noflavors`

Provide support for PHP. Add a runtime dependency on the default PHP version, package:lang/php81[].

`phpize`::
Use to build a PHP extension. Enables flavors.

`ext`::
Use to build, install and register a PHP extension. Enables flavors.

`zend`::
Use to build, install and register a Zend extension. Enables flavors.

`build`::
Set PHP also as a build-time dependency.

`cli`::
Needs the CLI version of PHP.

`cgi`::
Needs the CGI version of PHP.

`mod`::
Needs the Apache module for PHP.

`web`::
Needs the Apache module or the CGI version of PHP.

`embed`::
Needs the embedded library version of PHP.

`pecl`::
Provide defaults for fetching PHP extensions from the PECL repository. Enables flavors.
`flavors`::
Enable automatic crossref:flavors[flavors-auto-php,PHP flavors] generation. Flavors will be generated for all PHP versions, except the ones present in crossref:uses[uses-php-ignore,`IGNORE_WITH_PHP`].

`noflavors`::
Disable automatic PHP flavors generation. _Must only_ be used with extensions provided by PHP itself.

Variables are used to specify which PHP modules are required, as well as which versions of PHP are supported.

`USE_PHP`::
The list of required PHP extensions at run-time. Add `:build` to the extension name to add a build-time dependency. Example: `pcre xml:build gettext`

[[uses-php-ignore]]
`IGNORE_WITH_PHP`::
The port does not work with PHP of the given version. For possible values look at the content of `_ALL_PHP_VERSIONS` in [.filename]#Mk/Uses/php.mk#.

When building a PHP or Zend extension with `:ext` or `:zend`, these variables can be set:

`PHP_MODNAME`::
The name of the PHP or Zend extension. Default value is `${PORTNAME}`.

`PHP_HEADER_DIRS`::
A list of subdirectories from which to install header files. The framework will always install the header files that are present in the same directory as the extension.

`PHP_MOD_PRIO`::
The priority at which to load the extension. It is a number between `00` and `99`.
+
For extensions that do not depend on any extension, the priority is automatically set to `20`; for extensions that depend on another extension, the priority is automatically set to `30`. Some extensions may need to be loaded before every other extension, for example package:www/php56-opcache[]. Some may need to be loaded after an extension with a priority of `30`. In that case, add `PHP_MOD_PRIO=_XX_` in the port's Makefile. For example:
+
[.programlisting]
....
USES=	php:ext
USE_PHP=	wddx
PHP_MOD_PRIO=	40
....

These variables are available to use in `PKGNAMEPREFIX` or `PKGNAMESUFFIX`:

`PHP_PKGNAMEPREFIX`::
Contains `php_XY_-` where _XY_ is the current flavor's PHP version. Use with PHP extensions and modules.
`PHP_PKGNAMESUFFIX`::
Contains `-php_XY_` where _XY_ is the current flavor's PHP version. Use with PHP applications.

`PECL_PKGNAMEPREFIX`::
Contains `php_XY_-pecl-` where _XY_ is the current flavor's PHP version. Use with PECL modules.

[IMPORTANT]
====
With flavors, all PHP extensions, PECL extensions, and PEAR modules _must have_ a different package name, so they must all use one of these three variables in their `PKGNAMEPREFIX` or `PKGNAMESUFFIX`.
====

[[uses-pkgconfig]]
== `pkgconfig`

Possible arguments: (none), `build` (default), `run`, `both`

Uses package:devel/pkgconf[]. With no arguments or with the `build` argument, it implies `pkg-config` as a build-time dependency. `run` implies a run-time dependency and `both` implies both run-time and build-time dependencies.

[[uses-pure]]
== `pure`

Possible arguments: (none), `ffi`

Uses package:lang/pure[]. Largely used for building related pure ports. With the `ffi` argument, it implies package:devel/pure-ffi[] as a run-time dependency.

[[uses-pyqt]]
== `pyqt`

Possible arguments: (none), `4`, `5`

Uses PyQt. If the port is part of PyQt itself, set `PYQT_DIST`. Use `USE_PYQT` to select the components the port needs. The available components are:

* `core`
* `dbus`
* `dbussupport`
* `demo`
* `designer`
* `designerplugin`
* `doc`
* `gui`
* `multimedia`
* `network`
* `opengl`
* `qscintilla2`
* `sip`
* `sql`
* `svg`
* `test`
* `webkit`
* `xml`
* `xmlpatterns`

These components are only available with PyQt4:

* `assistant`
* `declarative`
* `help`
* `phonon`
* `script`
* `scripttools`

These components are only available with PyQt5:

* `multimediawidgets`
* `printsupport`
* `qml`
* `serialport`
* `webkitwidgets`
* `widgets`

The default dependency for each component is build- and run-time; to select only build or run, add `_build` or `_run` to the component name. For example:

[.programlisting]
....
USES= pyqt
USE_PYQT= core doc_build designer_run
....
[[uses-pytest]]
== `pytest`

Possible arguments: (none), `4`

Introduces a new dependency on package:devel/pytest[]. It defines a `do-test` target which will run the tests properly. Use the argument to depend on a specific package:devel/pytest[] version. For ports using package:devel/pytest[] consider using this instead of a specific `do-test` target. The framework exposes the following variables to the port:

`PYTEST_ARGS`::
Additional arguments to pytest (defaults to empty).

`PYTEST_IGNORED_TESTS`::
Lists of `pytest -k` patterns of tests to ignore (defaults to empty). For tests which are not expected to pass, such as ones requiring database access.

`PYTEST_BROKEN_TESTS`::
Lists of `pytest -k` patterns of tests to ignore (defaults to empty). For broken tests which require fixing.

In addition, the following variables may be set by the user:

`PYTEST_ENABLE_IGNORED_TESTS`::
Enable tests which are otherwise ignored by `PYTEST_IGNORED_TESTS`.

`PYTEST_ENABLE_BROKEN_TESTS`::
Enable tests which are otherwise ignored by `PYTEST_BROKEN_TESTS`.

`PYTEST_ENABLE_ALL_TESTS`::
Enable tests which are otherwise ignored by `PYTEST_IGNORED_TESTS` and `PYTEST_BROKEN_TESTS`.

[[uses-python]]
== `python`

Possible arguments: (none), `_X.Y_`, `_X.Y+_`, `_-X.Y_`, `_X.Y-Z.A_`, `patch`, `build`, `run`, `test`

Uses Python. A supported version or version range can be specified. If Python is only needed at build time, run time, or for the tests, it can be set as a build, run, or test dependency with `build`, `run`, or `test`. If Python is also needed during the patch phase, use `patch`. See crossref:special[using-python, Using Python] for more information.

`USES=python:env` can be used when the variables exported by the framework are needed but a dependency on Python is not. This can happen when using crossref:uses[uses-shebangfix,`USES=shebangfix`], where the goal is only to fix the shebangs but not add a dependency on Python.
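As an illustration, a port (hypothetical here) that needs Python 3.8 or later, and only at build time, could combine a version range with the `build` argument; check [.filename]#Mk/Uses/python.mk# for the exact argument syntax:

[.programlisting]
....
USES= python:3.8+,build
....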
[[uses-qmail]]
== `qmail`

Possible arguments: (none), `build`, `run`, `both`, `vars`

Uses package:mail/qmail[]. With the `build` argument, it implies `qmail` as a build-time dependency. `run` implies a run-time dependency. Using no argument or the `both` argument implies both run-time and build-time dependencies. `vars` will only set QMAIL variables for the port to use.

[[uses-qmake]]
== `qmake`

Possible arguments: (none), `norecursive`, `outsource`, `no_env`, `no_configure`

Uses QMake for configuring. For more information see crossref:special[using-qmake,Using `qmake`].

[[uses-qt]]
== `qt`

Possible arguments: `5`, `6`, `no_env`

Add dependency on Qt components. `no_env` is passed directly to `USES= qmake`. See crossref:special[using-qt,Using Qt] for more information.

[[uses-qt-dist]]
== `qt-dist`

Possible arguments: (none) or `5` and (none) or `6` and (none) or one of `3d`, `5compat`, `base`, `charts`, `connectivity`, `datavis3d`, `declarative`, `doc`, `languageserver`, `gamepad`, `graphicaleffects`, `imageformats`, `location`, `lottie`, `multimedia`, `networkauth`, `positioning`, `quick3d`, `quickcontrols2`, `quickcontrols`, `quicktimeline`, `remoteobjects`, `script`, `scxml`, `sensors`, `serialbus`, `serialport`, `shadertools`, `speech`, `svg`, `tools`, `translations`, `virtualkeyboard`, `wayland`, `webchannel`, `webengine`, `webglplugin`, `websockets`, `webview`, `x11extras`, `xmlpatterns`.

Provides support for building Qt 5 and Qt 6 components. It takes care of setting up the appropriate configuration environment for the port to build.

[[qt5-dist-example]]
.Building Qt 5 Components
[example]
====
The port is Qt 5's `networkauth` component, which is part of the `networkauth` distribution file.

[.programlisting]
....
PORTNAME= networkauth
DISTVERSION= ${QT5_VERSION}
USES= qt-dist:5
....
====

[[qt6-dist-example]]
.Building Qt 6 Components
[example]
====
The port is Qt 6's `websockets` component, which is part of the `websockets` distribution file.
[.programlisting]
....
PORTNAME= websockets
PORTVERSION= ${QT6_VERSION}
USES= qt-dist:6
....
====

If `PORTNAME` does not match the component name, it can be passed as an argument to `qt-dist`.

[[qt5-dist-example-explicit]]
.Building Qt 5 Components with Different Names
[example]
====
The port is Qt 5's `gui` component, which is part of the `base` distribution file.

[.programlisting]
....
PORTNAME= gui
DISTVERSION= ${QT5_VERSION}
USES= qt-dist:5,base
....
====

[[uses-readline]]
== `readline`

Possible arguments: (none), `port`

Uses readline as a library dependency, and sets `CPPFLAGS` and `LDFLAGS` as necessary. If the `port` argument is used or if readline is not present in the base system, add a dependency on package:devel/readline[].

[[uses-ruby]]
== `ruby`

Possible arguments: (none), `build`, `extconf`, `run`, `setup`

Provide support for Ruby related ports. Without arguments, adds a runtime dependency on package:lang/ruby[]. `build` adds a dependency on package:lang/ruby[] at build time. `extconf` states that the port uses extconf.rb to configure. `run` adds a dependency on package:lang/ruby[] at run time. This is also the default. `setup` states that the port uses setup.rb to configure and build.

The user may have the following variables defined:

`RUBY_VER`::
Alternative short version of ruby in the form of `x.y`.

`RUBY_DEFAULT_VER`::
Set to (e.g.) `2.7` to use `ruby27` as the default version.

`RUBY_ARCH`::
Set the architecture name (e.g. i386-freebsd7).

The following variables are exported to be used by the port:

`RUBY`::
Set to the full path of ruby. If set, the values of the following variables are automatically obtained from the ruby executable: `RUBY_ARCH`, `RUBY_ARCHLIBDIR`, `RUBY_LIBDIR`, `RUBY_SITEARCHLIBDIR`, `RUBY_SITELIBDIR`, `RUBY_VER` and `RUBY_VERSION`.

`RUBY_VER`::
Set to the alternative short version of ruby in the form of `x.y`.

`RUBY_EXTCONF`::
Set to the alternative name of extconf.rb (default: extconf.rb).
`RUBY_EXTCONF_SUBDIRS`:: Set to list of subdirectories, if multiple modules are included. `RUBY_SETUP`:: Set to the alternative name of setup.rb (default: setup.rb). [[uses-samba]] == `samba` Possible arguments: `build`, `env`, `lib`, `run` Handle dependency on Samba. `env` will not add any dependency and only set up the variables. `build` and `run` will add build-time and run-time dependency on [.filename]#smbd#. `lib` will add a dependency on [.filename]#libsmbclient.so#. The variables that are exported are: `SAMBA_PORT`:: The origin of the default Samba port. `SAMBA_INCLUDEDIR`:: The location of the Samba header files. `SAMBA_LIBS`:: The directory where the Samba shared libraries are available. `SAMBA_LDB_PORT`:: The origin of the ldb port used by the selected Samba version (e.g., package:databases/ldb28[]). It should be used if a port needs to depend on the same ldb version as the selected Samba version. `SAMBA_TALLOC_PORT`:: The origin of the talloc port used by the selected Samba version. It should be used if a port needs to depend on the same talloc version as the selected Samba version. `SAMBA_TDB_PORT`:: The origin of the TDB port used by the selected Samba version. It should be used if a port needs to depend on the same TDB version as the selected Samba version. `SAMBA_TEVENT_PORT`:: The origin of the tevent port used by the selected Samba version. It should be used if a port needs to depend on the same tevent version as the selected Samba version. [[uses-scons]] == `scons` Possible arguments: (none) Provide support for the use of package:devel/scons[]. See crossref:special[using-scons,Using `scons`] for more information. [[uses-sdl]] == `sdl` Possible arguments: `sdl` Provide support for the use of `SDL` packages. The variable `USE_SDL` is mandatory and specifies which components to add as dependencies. 
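As an illustration (the component selection here is hypothetical), a port needing the core SDL2 library plus its image and mixer add-ons would list all three in `USE_SDL`:

[.programlisting]
....
USES= sdl
USE_SDL= sdl2 image2 mixer2
....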
The current supported `SDL1.2` modules are:

* sdl
* console
* gfx
* image
* mixer
* mm
* net
* pango
* sound
* ttf

The current supported `SDL2` modules are:

* sdl2
* gfx2
* image2
* mixer2
* net2
* sound2
* ttf2

The current supported `SDL3` modules are:

* sdl3
* image3
* ttf3

[[uses-shared-mime-info]]
== `shared-mime-info`

Possible arguments: (none)

Uses update-mime-database from package:misc/shared-mime-info[]. This automatically adds a post-install step in such a way that the port itself can still specify its own post-install step if needed. It also adds a crossref:plist[plist-keywords-shared-mime-info,`@shared-mime-info`] entry to the plist.

[[uses-shebangfix]]
== `shebangfix`

Possible arguments: (none)

A lot of software uses incorrect locations for script interpreters, most notably [.filename]#/usr/bin/perl# and [.filename]#/bin/bash#. The shebangfix macro fixes shebang lines in scripts listed in `SHEBANG_REGEX`, `SHEBANG_GLOB`, or `SHEBANG_FILES`.

`SHEBANG_REGEX`::
Contains _one_ extended regular expression, and is used with the `-iregex` argument of man:find[1]. See crossref:uses[uses-shebangfix-ex-regex,`USES=shebangfix` with `SHEBANG_REGEX`].

`SHEBANG_GLOB`::
Contains a list of patterns used with the `-name` argument of man:find[1]. See crossref:uses[uses-shebangfix-ex-glob,`USES=shebangfix` with `SHEBANG_GLOB`].

`SHEBANG_FILES`::
Contains a list of files or man:sh[1] globs. The shebangfix macro is run from `${WRKSRC}`, so `SHEBANG_FILES` can contain paths that are relative to `${WRKSRC}`. It can also deal with absolute paths if files outside of `${WRKSRC}` require patching. See crossref:uses[uses-shebangfix-ex-files,`USES=shebangfix` with `SHEBANG_FILES`].

Currently Bash, Java, Ksh, Lua, Perl, PHP, Python, Ruby, Tcl, and Tk are supported by default.

There are three configuration variables:

`SHEBANG_LANG`::
The list of supported interpreters.

`_interp__CMD`::
The path to the command interpreter on FreeBSD.
The default value is `${LOCALBASE}/bin/_interp_`.

`_interp__OLD_CMD`::
The list of wrong invocations of interpreters. These are typically obsolete paths, or paths used on other operating systems that are incorrect on FreeBSD. They will be replaced by the correct path in `_interp__CMD`.
+
[NOTE]
====
These will _always_ be part of `_interp__OLD_CMD`: `"/usr/bin/env _interp_" /bin/_interp_ /usr/bin/_interp_ /usr/local/bin/_interp_`.
====
+
[TIP]
====
`_interp__OLD_CMD` can contain multiple values. Any entry with spaces must be quoted. See crossref:uses[uses-shebangfix-ex-ksh,Specifying all the Paths When Adding an Interpreter to `USES=shebangfix`].
====

[IMPORTANT]
====
The fixing of shebangs is done during the `patch` phase. If scripts are created with incorrect shebangs during the `build` phase, the build process (for example, the [.filename]#configure# script, or the [.filename]#Makefiles#) must be patched or given the right path (for example, with `CONFIGURE_ENV`, `CONFIGURE_ARGS`, `MAKE_ENV`, or `MAKE_ARGS`) to generate the right shebangs. Correct paths for supported interpreters are available in `_interp__CMD`.
====

[TIP]
====
When used with crossref:uses[uses-python,`USES=python`], and the aim is only to fix the shebangs but a dependency on Python itself is not wanted, use `USES=python:env` instead.
====

[[uses-shebangfix-ex-lua]]
.Adding Another Interpreter to `USES=shebangfix`
[example]
====
To add another interpreter, set `SHEBANG_LANG`. For example:

[.programlisting]
....
SHEBANG_LANG= lua
....
====

[[uses-shebangfix-ex-ksh]]
.Specifying all the Paths When Adding an Interpreter to `USES=shebangfix`
[example]
====
If it was not already defined, and there were no default values for `_interp__OLD_CMD` and `_interp__CMD`, the Ksh entry could be defined as:

[.programlisting]
....
SHEBANG_LANG= ksh
ksh_OLD_CMD= "/usr/bin/env ksh" /bin/ksh /usr/bin/ksh
ksh_CMD= ${LOCALBASE}/bin/ksh
....
====

[[uses-shebangfix-ex-strange]]
.Adding a Strange Location for an Interpreter
[example]
====
Some software uses strange locations for an interpreter. For example, an application might expect Python to be located in [.filename]#/opt/bin/python2.7#. The strange path to be replaced can be declared in the port [.filename]#Makefile#:

[.programlisting]
....
python_OLD_CMD= /opt/bin/python2.7
....
====

[[uses-shebangfix-ex-regex]]
.`USES=shebangfix` with `SHEBANG_REGEX`
[example]
====
To fix all the files in `${WRKSRC}/scripts` ending in [.filename]#.pl#, [.filename]#.sh#, or [.filename]#.cgi#, do:

[.programlisting]
....
USES= shebangfix
SHEBANG_REGEX= ./scripts/.*\.(sh|pl|cgi)
....

[NOTE]
======
`SHEBANG_REGEX` is used by running `find -E`, which uses modern regular expressions, also known as extended regular expressions. See man:re_format[7] for more information.
======
====

[[uses-shebangfix-ex-glob]]
.`USES=shebangfix` with `SHEBANG_GLOB`
[example]
====
To fix all the files in `${WRKSRC}` ending in [.filename]#.pl# or [.filename]#.sh#, do:

[.programlisting]
....
USES= shebangfix
SHEBANG_GLOB= *.sh *.pl
....
====

[[uses-shebangfix-ex-files]]
.`USES=shebangfix` with `SHEBANG_FILES`
[example]
====
To fix the files [.filename]#script/foobar.pl# and [.filename]#script/*.sh# in `${WRKSRC}`, do:

[.programlisting]
....
USES= shebangfix
SHEBANG_FILES= scripts/foobar.pl scripts/*.sh
....
====

[[uses-sqlite]]
== `sqlite`

Possible arguments: (none), `2`, `3`

Add a dependency on SQLite. The default version used is 3, but version 2 is also possible using the `:2` modifier.

[[uses-sbrk]]
== `sbrk`

Possible arguments: (none)

Marks the port as `BROKEN` on `aarch64` and `riscv64`.

[[uses-ssl]]
== `ssl`

Possible arguments: (none), `build`, `run`

Provide support for OpenSSL. A build- or run-time only dependency can be specified using `build` or `run`.
These variables are available for the port's use; they are also added to `MAKE_ENV`:

`OPENSSLBASE`::
Path to the OpenSSL installation base.

`OPENSSLDIR`::
Path to OpenSSL's configuration files.

`OPENSSLLIB`::
Path to the OpenSSL libraries.

`OPENSSLINC`::
Path to the OpenSSL includes.

`OPENSSLRPATH`::
If defined, the path the linker needs to use to find the OpenSSL libraries.

[TIP]
====
If a port does not build with an OpenSSL flavor, set the `BROKEN_SSL` variable, and possibly the `BROKEN_SSL_REASON__flavor_`:

[.programlisting]
....
BROKEN_SSL= libressl
BROKEN_SSL_REASON_libressl= needs features only available in OpenSSL
....
====

[[uses-tar]]
== `tar`

Possible arguments: (none), `Z`, `bz2`, `bzip2`, `lzma`, `tbz`, `tbz2`, `tgz`, `txz`, `xz`, `zst`, `zstd`

Set `EXTRACT_SUFX` to `.tar`, `.tar.Z`, `.tar.bz2`, `.tar.bz2`, `.tar.lzma`, `.tbz`, `.tbz2`, `.tgz`, `.txz`, `.tar.xz`, `.tar.zst`, or `.tar.zstd` respectively.

[[uses-tcl]]
== `tcl`

Possible arguments: _version_, `wrapper`, `build`, `run`, `tea`

Add a dependency on Tcl. A specific version can be requested using _version_. The version can be empty, one or more exact version numbers (currently `84`, `85`, or `86`), or a minimal version number (currently `84+`, `85+`, or `86+`). To only request a non-version-specific wrapper, use `wrapper`. A build- or run-time only dependency can be specified using `build` or `run`. To build the port using the Tcl Extension Architecture, use `tea`.
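As an illustration (hypothetical port; check [.filename]#Mk/Uses/tcl.mk# for the exact argument syntax), a port requiring Tcl 8.6 or newer only at build time could use:

[.programlisting]
....
USES= tcl:86+,build
....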
After including [.filename]#bsd.port.pre.mk# the port can inspect the results using these variables:

* `TCL_VER`: chosen major.minor version of Tcl
* `TCLSH`: full path of the Tcl interpreter
* `TCL_LIBDIR`: path of the Tcl libraries
* `TCL_INCLUDEDIR`: path of the Tcl C header files
* `TCL_PKG_LIB_PREFIX`: Library prefix, as per TIP595
* `TCL_PKG_STUB_POSTFIX`: Stub library postfix
* `TK_VER`: chosen major.minor version of Tk
* `WISH`: full path of the Tk interpreter
* `TK_LIBDIR`: path of the Tk libraries
* `TK_INCLUDEDIR`: path of the Tk C header files

[[uses-terminfo]]
== `terminfo`

Possible arguments: (none)

Adds crossref:plist[plist-keywords-terminfo,`@terminfo`] to the [.filename]#plist#. Use when the port installs [.filename]#*.terminfo# files in [.filename]#${PREFIX}/share/misc#.

[[uses-tex]]
== `tex`

Possible arguments: (none)

Provide support for TeX. Loads all the default variables for TeX-related ports and does not add any dependency on any ports.

Variables are used to specify which TeX modules are required.

`USE_TEX`::
The list of required TeX extensions at run-time. Add `:build` to the extension name to add a build-time dependency, `:run` for a run-time dependency, `:test` for a test-time dependency, and `:extract` for an extract-time dependency. Example: `base texmf:build source:run`

Current possible arguments are as follows:

* `base`
* `texmf`
* `source`
* `docs`
* `web2c`
* `kpathsea`
* `ptexenc`
* `basic`
* `tlmgr`
* `texlua`
* `texluajit`
* `synctex`
* `xpdfopen`
* `dvipsk`
* `dvipdfmx`
* `xdvik`
* `gbklatex`
* `formats`
* `tex`
* `latex`
* `pdftex`
* `jadetex`
* `luatex`
* `ptex`
* `xetex`
* `xmltex`
* `texhash`
* `updmap`
* `fmtutil`

[[uses-tk]]
== `tk`

Same as arguments for `tcl`

Small wrapper when using both Tcl and Tk. The same variables are returned as when using Tcl.

[[uses-trigger]]
== `trigger`

Possible arguments: (none)

Provide support for ports requiring triggers to be executed by man:pkg[8].
Triggers are executed at the end of a transaction if the conditions are met. The following variable can be set by ports: `TRIGGERS`:: List of triggers to package. Defaults to `${PORTNAME}`. Triggers are specified in UCL format and are usually placed in the [.filename]#files/# directory of the port. [[uses-uidfix]] == `uidfix` Possible arguments: (none) Changes some default behavior (mostly variables) of the build system to allow installing this port as a normal user. Try this in the port before using crossref:uses[uses-fakeroot,`USES=fakeroot`] or patching. [[uses-uniquefiles]] == `uniquefiles` Possible arguments: (none), `dirs` Make files or directories 'unique', by adding a prefix or suffix. If the `dirs` argument is used, the port needs a prefix (and only a prefix) based on `UNIQUE_PREFIX` for standard directories `DOCSDIR`, `EXAMPLESDIR`, `DATADIR`, `WWWDIR`, `ETCDIR`. These variables are available for ports: * `UNIQUE_PREFIX`: The prefix to be used for directories and files. Default: `${PKGNAMEPREFIX}`. * `UNIQUE_PREFIX_FILES`: A list of files that need to be prefixed. Default: empty. * `UNIQUE_SUFFIX`: The suffix to be used for files. Default: `${PKGNAMESUFFIX}`. * `UNIQUE_SUFFIX_FILES`: A list of files that need to be suffixed. Default: empty. [[uses-vala]] == `vala` Possible arguments: `build`, `lib`, `no_depend` Adds build or library dependencies on package:lang/vala[]. The `no_depend` argument is reserved for package:lang/vala[] itself. [[uses-varnish]] == `varnish` Possible arguments: `4` (default), `6`, `7` Handle dependencies on Varnish Cache. Adds a dependency on package:www/varnish*[]. [[uses-waf]] == `waf` Possible arguments: (none) Provide support for ports using the `waf` build system. It implies `USES=python:build`. The following variables are exported to be used by the port: `WAF_CMD`:: Location of the `waf` script. Set this if the `waf` script is not in [.filename]#WRKSRC/waf#. `CONFIGURE_TARGET`:: `Configure` target. Default `configure`. 
`ALL_TARGET`:: `All` target. Default `build`. `INSTALL_TARGET`:: `Install` target. Default `install`. [[uses-webplugin]] == `webplugin` Possible arguments: (none), `ARGS` Automatically create and remove symbolic links for each application that supports the webplugin framework. `ARGS` can be one of: * `gecko`: support plug-ins based on Gecko * `native`: support plug-ins for Gecko, Opera, and WebKit-GTK * `linux`: support Linux plug-ins * `all` (default, implicit): support all plug-in types * (individual entries): support only the browsers listed These variables can be adjusted: * `WEBPLUGIN_FILES`: No default, must be set manually. The plug-in files to install. * `WEBPLUGIN_DIR`: The directory to install the plug-in files to, default [.filename]#PREFIX/lib/browser_plugins/WEBPLUGIN_NAME#. Set this if the port installs plug-in files outside of the default directory to prevent broken symbolic links. * `WEBPLUGIN_NAME`: The final directory to install the plug-in files into, default `PKGBASE`. [[uses-xfce]] == `xfce` Possible arguments: (none), `gtk2` Provide support for Xfce related ports. See crossref:special[using-xfce,Using Xfce] for details. The `gtk2` argument specifies that the port requires GTK2 support. It adds additional features provided by some core components, for example, package:x11/libxfce4menu[] and package:x11-wm/xfce4-panel[]. [[uses-xorg]] == `xorg` Possible arguments: (none) Provides an easy way to depend on X.org components. The components should be listed in `USE_XORG`. 
The available components are: [[using-x11-components]] .Available X.Org Components [cols="1,1", frame="none", options="header"] |=== | Name | Description |`dmx` |DMX extension library |`fontenc` |The fontenc Library |`fontutil` |Create an index of X font files in a directory |`ice` |Inter Client Exchange library for X11 |`libfs` |The FS library |`pciaccess` |Generic PCI access library |`pixman` |Low-level pixel manipulation library |`sm` |Session Management library for X11 |`x11` |X11 library |`xau` |Authentication Protocol library for X11 |`xaw` |X Athena Widgets library |`xaw6` |X Athena Widgets library |`xaw7` |X Athena Widgets library |`xbitmaps` |X.Org bitmaps data |`xcb` |The X protocol C-language Binding (XCB) library |`xcomposite` |X Composite extension library |`xcursor` |X client-side cursor loading library |`xdamage` |X Damage extension library |`xdmcp` |X Display Manager Control Protocol library |`xext` |X11 Extension library |`xfixes` |X Fixes extension library |`xfont` |X font library |`xfont2` |X font library |`xft` |Client-sided font API for X applications |`xi` |X Input extension library |`xinerama` |X11 Xinerama library |`xkbfile` |XKB file library |`xmu` |X Miscellaneous Utilities libraries |`xmuu` |X Miscellaneous Utilities libraries |`xorg-macros` |X.Org development aclocal macros |`xorg-server` |X.Org X server and related programs |`xorgproto` |xorg protocol headers |`xpm` |X Pixmap library |`xpresent` |X Present Extension library |`xrandr` |X Resize and Rotate extension library |`xrender` |X Render extension library |`xres` |X Resource usage library |`xscrnsaver` |The XScrnSaver library |`xshmfence` |Shared memory 'SyncFence' synchronization primitive |`xt` |X Toolkit library |`xtrans` |Abstract network code for X |`xtst` |X Test extension |`xv` |X Video Extension library |`xvmc` |X Video Extension Motion Compensation library |`xxf86dga` |X DGA Extension |`xxf86vm` |X Vidmode Extension |=== [[uses-xorg-cat]] == `xorg-cat` Possible arguments: 
`app`, `data`, `doc`, `driver`, `font`, `lib`, `proto`, `util`, `xserver`, and (none) or one of `autotools` (default), `meson`

Provide support for building Xorg components. It takes care of setting up the common dependencies and the appropriate configuration environment needed. This is intended only for Xorg components. The category has to match upstream categories. The second argument is the build system to use. `autotools` is the default, but `meson` is also supported.

[[uses-zip]]
== `zip`

Possible arguments: (none), `infozip`

Indicates that the distribution files use the ZIP compression algorithm. For files using the InfoZip algorithm the `infozip` argument must be passed to set the appropriate dependencies.
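For instance, a port whose distfile was created with InfoZip tools would set:

[.programlisting]
....
USES= zip:infozip
....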