<?xml version="1.0"?>
<!DOCTYPE webpage
  PUBLIC "-//NetBSD//DTD Website-based NetBSD Extension//EN"
        "http://www.NetBSD.org/XML/htdocs/lang/share/xml/website-netbsd.dtd">

<webpage id="docs-kernel-programming">
  <config param="desc" value="NetBSD Documentation: Kernel Programming FAQ"/>
  <config param="cvstag" 
    value="$NetBSD: programming.xml,v 1.2 2008/05/31 10:43:42 tsutsui Exp $"/>
  <config param="rcsdate" value="$Date: 2008/05/31 10:43:42 $"/>
  <head>

    <!-- Copyright (c) 1994-2005
    The NetBSD Foundation, Inc.  ALL RIGHTS RESERVED. -->
    
    <title>NetBSD Documentation: Kernel Programming FAQ</title>
  </head>


  <sect1 role="toc">
    <sect2 id="misc">
      <title>Misc</title>

      <sect3 id="knf">
	<title>What is KNF?</title>

	<para>KNF stands for "Kernel Normal Form". It is the C coding
	  style documented in
	  <filename>/usr/share/misc/style</filename>, which is
	  included in the source tree as <filename role="cvsweb">src/share/misc/style</filename>.</para>
      </sect3>

      <sect3 id="packed-attribute">
	<title>Using the `packed' attribute</title>
	
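	<para>Always use the <code>`packed'</code> attribute in
	  structures which describe wire protocol data formats, so
	  that the compiler does not insert padding between members
	  and the in-memory layout matches the bytes on the wire.</para>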
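
	<para>As a minimal sketch, a hypothetical on-the-wire header
	  could be declared as follows. The structure and field names
	  are made up for illustration; <code>__packed</code> is the
	  macro from <code>&lt;sys/cdefs.h&gt;</code> that expands to
	  the compiler's packed attribute:
	  <programlisting>
#include &lt;sys/cdefs.h&gt;
#include &lt;sys/types.h&gt;

/*
 * Hypothetical wire header: 1 + 2 + 4 = 7 bytes on the wire.
 * Without __packed the compiler would typically pad this to 8 bytes.
 */
struct foo_wire_hdr {
        uint8_t         fwh_type;       /* message type */
        uint16_t        fwh_len;        /* payload length, network byte order */
        uint32_t        fwh_seq;        /* sequence number, network byte order */
} __packed;</programlisting></para>

	<para>Packing only removes padding; multi-byte fields still
	  need explicit byte order conversion (for example with
	  <code>ntohs()</code> and <code>ntohl()</code>).</para>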
      </sect3>

      <sect3 id="printf">
	<title>Using <code>printf()</code> for debugging</title>

	<para>Probably the simplest way of generating debugging
	  information from a kernel driver is to use
	  <code>printf()</code>. The kernel printf will send output to
	  the console, so beware of generating too much output and
	  making the system unusable.</para>
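
	<para>One common way to keep the output under control,
	  sketched here for a hypothetical driver <quote>foo</quote>
	  with a hypothetical <code>FOO_DEBUG</code> compile-time
	  option, is to wrap <code>printf()</code> in a macro that
	  can be compiled out or switched off at run time:
	  <programlisting>
#include &lt;sys/param.h&gt;
#include &lt;sys/systm.h&gt;          /* kernel printf() */

#ifdef FOO_DEBUG
int foo_debug = 1;              /* set to 0 to silence the driver */
#define DPRINTF(x)      do { if (foo_debug) printf x; } while (0)
#else
#define DPRINTF(x)      /* nothing */
#endif

static void
foo_report_status(int status)
{
        /* Double parentheses: the whole argument list is one macro argument. */
        DPRINTF(("foo: interrupt status 0x%x\n", status));
}</programlisting></para>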
      </sect3>

      <sect3 id="forcing-ddb">
	<title>Forcing code to enter DDB</title>

	<para>Ensure your kernel config file contains
	  '<code>options DDB</code>' and that the source file has
	  '<code>#include "opt_ddb.h"</code>', then call
	  '<code>Debugger()</code>' at the point where the kernel
	  should enter DDB.</para>
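
	<para>Put together, this might look like the following
	  sketch. The function <code>foo_check()</code> and its
	  error condition are hypothetical; guarding the call with
	  <code>#ifdef DDB</code> keeps the file building on kernels
	  configured without the debugger:
	  <programlisting>
#include "opt_ddb.h"

#include &lt;sys/param.h&gt;
#include &lt;sys/systm.h&gt;

void
foo_check(int error)
{
        if (error != 0) {
                printf("foo: unexpected error %d\n", error);
#ifdef DDB
                Debugger();     /* stop here so the state can be inspected */
#endif
        }
}</programlisting></para>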
      </sect3>

      <sect3 id="adding_a_new_driver">
	<title>Adding a new driver to the kernel</title>

	<para>Every driver needs at least:
	  <itemizedlist>
	    <listitem>An <code><emphasis>xxx</emphasis>probe()</code>
	      routine, during which NetBSD attempts to determine
	      whether the device is present.</listitem>
	    <listitem>An <code><emphasis>xxx</emphasis>attach()</code>
	      routine, which configures and attaches the
	      device.</listitem>
	  </itemizedlist>
	</para>

	<para>Once probe and attach routines have been written, add
	  an entry to 
	  <filename>/usr/src/sys/arch/&lt;your-arch&gt;/&lt;your-arch&gt;/conf.c</filename>.</para>
	
	<para>There are two tables:
	  <itemizedlist>
	    <listitem><code>cdevsw</code> for character devices.</listitem>
	    <listitem><code>bdevsw</code> for block devices (for those
	      that also perform "block" I/O and use a strategy
	      routine).</listitem>
	  </itemizedlist></para>

	<para>Most entries will be of the form
	  <code>cdev_<emphasis>xxx</emphasis>_init()</code>, which
	  is a macro handling prototyping of the standard Unix
	  device switch routines.</para>

	<para>The probe/attach routines are called at boot time.  The
	  <code>open()</code>, <code>close()</code>,
	  <code>read()</code>, and <code>write()</code> routines are
	  called when you operate on the device special file whose major
	  number corresponds to the index into that table.  For
	  example, if you open a device whose major number is 18,
	  the "open" routine for entry 18 in
	  <code>cdevsw[]</code> or <code>bdevsw[]</code> will be called.</para>
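
	<para>The indexing can be illustrated with a small userland
	  toy model. This is not the kernel's real
	  <code>struct cdevsw</code>, which has more members and
	  different signatures; every name below is invented for
	  the illustration:
	  <programlisting>
#include &lt;stdio.h&gt;

/* Toy device switch entry with a single "open" routine. */
struct toy_cdevsw {
        int     (*d_open)(int minor);
};

static int
null_open(int minor)
{
        printf("null open, minor %d\n", minor);
        return 0;
}

static int
foo_open(int minor)
{
        printf("foo open, minor %d\n", minor);
        return 0;
}

/* The position in the table is the major number. */
static const struct toy_cdevsw toy_cdevsw[] = {
        { null_open },  /* major 0 */
        { foo_open },   /* major 1 */
};

int
main(void)
{
        int maj = 1, min = 0;   /* as if decoded from the device special file */

        /* Dispatch through the table entry selected by the major number. */
        return (*toy_cdevsw[maj].d_open)(min);
}</programlisting></para>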

	<para>Most drivers are split between bus-specific attach code and
	  a machine-independent (MI) core. As an example, the driver for
	  the PCI LANCE Ethernet chip has entries in the following files:

	  <itemizedlist>
	    <listitem><filename
		role="cvsweb">src/sys/dev/pci/files.pci</filename>
	      - attach information (look for 'le at pci').</listitem>
	    <listitem><filename role="cvsweb">src/sys/dev/pci/if_le_pci.c</filename> 
	      - PCI bus attach code for the driver.</listitem>
	  </itemizedlist>
	</para>
	
	<para>
	  <itemizedlist>
	    <listitem><filename role="cvsweb">src/sys/conf/files</filename>
	      - MI core attach information (look for 'le:').</listitem>
	    <listitem><filename role="cvsweb">src/sys/dev/ic/am7990.c</filename>
	      - MI driver 24-bit access code.</listitem>
	    <listitem><filename role="cvsweb">src/sys/dev/ic/am79900.c</filename>
	      - MI driver 32-bit access code.</listitem>
	    <listitem><filename role="cvsweb">src/sys/dev/ic/lance.c</filename>
	      - MI core driver code.</listitem>
	  </itemizedlist>
	</para>

	<para>See also <ulink 
	    url="#autoconf">the autoconf explanation</ulink>.</para>
      </sect3>

      <sect3 id="autoconf">
	<title>How does all this autoconf stuff work?</title>
	
	<para>The autoconf machinery is quite simple once you figure
	  out how it works. If you want to ignore the exact
	  details of how the device probe tree is built and walked
	  at runtime, each individual <quote>leaf</quote> driver
	  needs the following:
	  
	  <orderedlist>
	    <listitem>Each driver specifies a structure holding
	      three things: the size of its private structure, its
	      probe function and its attach function. This structure
	      is compiled in and used at runtime. For example:
	      <programlisting>
struct cfattach foo_baz_ca = {
    sizeof(struct foo_baz_softc), foo_baz_match, foo_baz_attach
};</programlisting></listitem>
	    <listitem>On kernel startup, once the time comes to
	      attach the device, the autoconf code calls the device's
	      probe routine and passes it a pointer to the parent
	      (<code>struct device *parent</code>), a pointer
	      to the attach tag structure (<code>void *aux</code>),
	      and the appropriate autoconf node (<code>struct cfdata
		*cf</code>). The driver is expected to find out whether
	      the device is where it's supposed to be (commonly, the
	      location and configuration information is passed in the
	      attach tag). If it is, the probe routine should return 1.
	      If the device is not there, the probe routine must return 0.
	      <emphasis role="bold">NO STATE SHOULD BE KEPT</emphasis>
	      in either case.</listitem>
	    
	    <listitem>If the probe returned success, autoconf allocates
	      a chunk of memory of the size specified in the device's
	      <code>*_ca</code> structure and calls its attach routine,
	      passing it a pointer to the parent
	      (<code>struct device *parent</code>), a
	      pointer to the freshly allocated memory
	      (<code>struct device *self</code>) and the attach tag
	      (<code>void *aux</code>). The driver is expected to find
	      out the exact ports and memory, allocate resources and
	      initialize its internal structure
	      accordingly. Preferably, all driver instance specific
	      information should be kept in the allocated
	      memory.</listitem>
	  </orderedlist>
	</para>
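
	<para>The three steps above can be sketched as a skeleton
	  driver. It follows the older <code>struct cfattach</code>
	  form used in this text; newer NetBSD releases declare the
	  attachment differently, so compare with a driver in your
	  own tree. The <code>sc_flags</code> member is a
	  hypothetical piece of driver state:
	  <programlisting>
#include &lt;sys/param.h&gt;
#include &lt;sys/systm.h&gt;
#include &lt;sys/device.h&gt;

/* Per-instance state; autoconf allocates sizeof(struct foo_baz_softc). */
struct foo_baz_softc {
        struct device   sc_dev;         /* generic device info; must be first */
        int             sc_flags;       /* hypothetical driver-private field */
};

static int      foo_baz_match(struct device *, struct cfdata *, void *);
static void     foo_baz_attach(struct device *, struct device *, void *);

struct cfattach foo_baz_ca = {
        sizeof(struct foo_baz_softc), foo_baz_match, foo_baz_attach
};

static int
foo_baz_match(struct device *parent, struct cfdata *cf, void *aux)
{
        /* Inspect aux only and keep no state; return 1 to claim the device. */
        return 0;
}

static void
foo_baz_attach(struct device *parent, struct device *self, void *aux)
{
        /* self points at the freshly allocated softc. */
        struct foo_baz_softc *sc = (struct foo_baz_softc *)self;

        printf(": hypothetical foo_baz device\n");
        sc-&gt;sc_flags = 0;
}</programlisting></para>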
	
	<para>Example: consider a PCI Ethernet device 'baz' whose
	  kernel config chunk looks like this:
	  <programlisting>
pci*    at mainbus?
baz*    at pci? dev ? function ?</programlisting>

	  At runtime, autoconf iterates over all physical devices
	  present on the machine's PCI bus. For each physical device, it
	  iterates over all drivers registered in the kernel as
	  attaching to the PCI bus, and calls each driver's probe routine.
	  If any probe routine claims the device by returning 1, autoconf
	  stops iterating and performs the attach step (item 3 in the
	  list above). Once the attach function returns, autoconf
	  continues with the next physical device.</para>
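
	<para>For such a PCI device the probe routine usually just
	  compares the IDs found in the attach tag. A minimal
	  sketch, assuming made-up vendor and product IDs for the
	  hypothetical 'baz' device:
	  <programlisting>
#include &lt;sys/param.h&gt;
#include &lt;sys/device.h&gt;

#include &lt;dev/pci/pcireg.h&gt;
#include &lt;dev/pci/pcivar.h&gt;

#define BAZ_PCI_VENDOR  0x1234  /* hypothetical vendor ID */
#define BAZ_PCI_PRODUCT 0x5678  /* hypothetical product ID */

static int
baz_pci_match(struct device *parent, struct cfdata *cf, void *aux)
{
        struct pci_attach_args *pa = aux;       /* the PCI attach tag */

        if (PCI_VENDOR(pa-&gt;pa_id) == BAZ_PCI_VENDOR &amp;&amp;
            PCI_PRODUCT(pa-&gt;pa_id) == BAZ_PCI_PRODUCT)
                return 1;       /* claim the device */
        return 0;               /* not ours; autoconf tries the next driver */
}</programlisting></para>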
	
	<para>See also <ulink 
	    url="#adding_a_new_driver">Adding a new driver</ulink>.</para>
      </sect3>
      
      <sect3 id="adding_a_system_call">
	<title>Adding a system call</title>
	
	<para>Add an entry in <code>syscalls.master</code>, and add
	  the syscall stub to the appropriate place in 
	  <code><filename role="cvsweb">src/lib/libc/sys/Makefile.inc</filename></code>.</para>
	<para>See the <ulink
	    url="../internals/en/chap-processes.html#syscall_howto">
            HOWTO</ulink> and related documentation in the <ulink
	    url="../internals/en/">
	    NetBSD Internals Guide</ulink> for more information.
	</para>
      </sect3>
	
      <sect3 id="adding_a_sysctl">
	<title>Adding a sysctl</title>
	
	<para>See a <ulink
	    url="http://mail-index.NetBSD.org/tech-kern/2001/06/24/0000.html">posting</ulink>
	  answering this question on  <ulink 
	    url="http://mail-index.NetBSD.org/tech-kern/">tech-kern</ulink>.</para>
	
	<para>Note that NetBSD 1.6 and later provide a special
	  <quote>vendor</quote> sysctl category that is reserved for
	  vendor-specific entries. See &man.sysctl.8; for more
	  information.</para>
      </sect3>
      
      <sect3 id="mmap_in_pseudo-device">
	<title>How to implement &man.mmap.2; in a pseudo-device</title>
	
	<para>Your device is most likely a character device, so
	  you will be using the device pager (the VM system hides
	  all of this from you, don't worry).</para>
	
	<para>The first thing you need to do is pick some
	  arbitrary offsets for your mmap interface.  Something
	  like "mmap offset 0-M gives object A, N-O gives object
	  B", etc.</para>
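
	<para>For instance, the offset space could be laid out with
	  a few macros. The values below are arbitrary and only
	  meant as an illustration; all that matters is that every
	  offset is page aligned and that the regions do not overlap:
	  <programlisting>
/* Hypothetical layout of the mmap offset space for a "foo" pseudo-device. */
#define FOO_REGION1_MMAP_OFFSET 0x00000000      /* object A starts here */
#define FOO_REGION1_SIZE        (4 * PAGE_SIZE) /* four pages */
#define FOO_REGION2_MMAP_OFFSET 0x00100000      /* object B starts here */
#define FOO_REGION2_SIZE        (1 * PAGE_SIZE) /* one page */</programlisting></para>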
	
	<para>After that, your mmap routine would look something
	  like this:
	  <programlisting>
int
foommap(dev_t dev, int off, int prot)
{

        if (off &amp; PAGE_MASK)
                panic("foommap");

        if ((u_int)off &gt;= FOO_REGION1_MMAP_OFFSET &amp;&amp;
            (u_int)off &lt; (FOO_REGION1_MMAP_OFFSET + FOO_REGION1_SIZE))
                return (atop(FOO_REGION1_ADDR + ((u_int)off -
                    FOO_REGION1_MMAP_OFFSET)));

        if ((u_int)off &gt;= FOO_REGION2_MMAP_OFFSET &amp;&amp;
            (u_int)off &lt; (FOO_REGION2_MMAP_OFFSET + FOO_REGION2_SIZE))
                return (atop(FOO_REGION2_ADDR + ((u_int)off -
                    FOO_REGION2_MMAP_OFFSET)));

        /* Page not found. */
        return (-1);
}</programlisting></para>

	  <para>This is complicated slightly by the fact
	    that you are going to be mmap'ing what are simply
	    kernel memory objects (it is a pseudo-device, after
	    all).</para>
	  
	  <para>In order to make this work, you're going to want
	    to make sure you allocate the memory objects to be
	    mmap'd on page-aligned boundaries.  If you are
	    allocating something &gt;= <code>PAGE_SIZE</code> in
	    size, this is guaranteed. Otherwise, you are going to
	    have to use <code>uvm_km_alloc()</code>, and round
	    your allocation size up to page size.</para>
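
	  <para>A sketch of such an allocation follows. It assumes
	    the four-argument form of <code>uvm_km_alloc()</code>
	    described in &man.uvm.9; on newer kernels (older kernels
	    take only a map and a size), and the helper name
	    <code>foo_alloc_objects()</code> is hypothetical:
	    <programlisting>
#include &lt;sys/param.h&gt;
#include &lt;sys/errno.h&gt;

#include &lt;uvm/uvm_extern.h&gt;

static vaddr_t foo_object1;     /* page-aligned kernel virtual address */

static int
foo_alloc_objects(void)
{
        /* Round the size up to a whole number of pages. */
        vsize_t size = round_page(FOO_REGION1_SIZE);

        /* Assumed signature: (map, size, align, flags); check uvm(9). */
        foo_object1 = uvm_km_alloc(kernel_map, size, 0,
            UVM_KMF_WIRED | UVM_KMF_ZERO);
        if (foo_object1 == 0)
                return ENOMEM;
        return 0;
}</programlisting></para>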
	  
	  <para>Then it would look a bit more like this:
	    <programlisting>
int
foommap(dev_t dev, int off, int prot)
{
        paddr_t pa;

        if (off &amp; PAGE_MASK)
                panic("foommap: offset not page aligned");

        if ((u_int)off &gt;= FOO_REGION1_MMAP_OFFSET &amp;&amp;
            (u_int)off &lt; (FOO_REGION1_MMAP_OFFSET + FOO_REGION1_SIZE)) {
                if ((vaddr_t)foo_object1 &amp; PAGE_MASK)
                        panic("foommap: foo_object1 not page aligned");
                if (pmap_extract(pmap_kernel(), (vaddr_t)foo_object1 +
                    (u_int)off - FOO_REGION1_MMAP_OFFSET, &amp;pa) == FALSE)
                        panic("foommap: foo_object1 page not mapped");
                return (atop(pa));
        }

        if ((u_int)off &gt;= FOO_REGION2_MMAP_OFFSET &amp;&amp;
            (u_int)off &lt; (FOO_REGION2_MMAP_OFFSET + FOO_REGION2_SIZE)) {
                if ((vaddr_t)foo_object2 &amp; PAGE_MASK)
                        panic("foommap: foo_object2 not page aligned");
                if (pmap_extract(pmap_kernel(), (vaddr_t)foo_object2 +
                    (u_int)off - FOO_REGION2_MMAP_OFFSET, &amp;pa) == FALSE)
                        panic("foommap: foo_object2 page not mapped");
                return (atop(pa));
        }

        /* Page not found. */
        return (-1);
}</programlisting></para>
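
	  <para>From userland, a program would then map a region by
	    passing the agreed offset to &man.mmap.2;. The following
	    is only a sketch: the device node
	    <filename>/dev/foo</filename> is hypothetical, and the
	    offset must match whatever the driver defines:
	    <programlisting>
#include &lt;sys/mman.h&gt;

#include &lt;err.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;

#define FOO_DEVICE              "/dev/foo"      /* hypothetical device node */
#define FOO_REGION1_MMAP_OFFSET 0               /* must match the driver */

int
main(void)
{
        size_t len = (size_t)sysconf(_SC_PAGESIZE);     /* map one page */
        void *p;
        int fd;

        if ((fd = open(FOO_DEVICE, O_RDWR)) == -1)
                err(1, "open %s", FOO_DEVICE);

        /* The offset selects which kernel object the driver hands back. */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
            FOO_REGION1_MMAP_OFFSET);
        if (p == MAP_FAILED)
                err(1, "mmap");

        printf("first byte of region 1: 0x%02x\n", *(unsigned char *)p);

        munmap(p, len);
        close(fd);
        return 0;
}</programlisting></para>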
      </sect3>
      
      <sect3 id="accessing_a_kernel_structure_from_userland">
	<title>Accessing a kernel structure from userland</title>
	
	<para>The canonical example for this is
	  <code><filename role="cvsweb">src/usr.bin/vmstat/dkstats.c</filename></code>,
	  which reads disk statistics.</para>
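
	<para>One general way to do this from userland is the
	  &man.kvm.3; library. The sketch below is not taken from
	  <filename>dkstats.c</filename>, which may use a different
	  mechanism in your tree; the kernel symbol
	  <code>_foo_counter</code> is hypothetical and is assumed
	  to be an <code>int</code>. Link with
	  <code>-lkvm</code>:
	  <programlisting>
#include &lt;sys/types.h&gt;

#include &lt;err.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;kvm.h&gt;
#include &lt;limits.h&gt;
#include &lt;nlist.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

int
main(void)
{
        struct nlist nl[2];
        char errbuf[_POSIX2_LINE_MAX];
        kvm_t *kd;
        int value;

        /* The name list is terminated by an entry with a NULL name. */
        memset(nl, 0, sizeof(nl));
        nl[0].n_name = "_foo_counter";  /* hypothetical kernel symbol */

        /* NULL arguments select the running kernel and its memory. */
        kd = kvm_openfiles(NULL, NULL, NULL, O_RDONLY, errbuf);
        if (kd == NULL)
                errx(1, "kvm_openfiles: %s", errbuf);

        /* Look up the symbol's address in the kernel name list. */
        if (kvm_nlist(kd, nl) != 0)
                errx(1, "kvm_nlist: symbol not found");

        /* Copy the variable out of kernel memory. */
        if (kvm_read(kd, nl[0].n_value, &amp;value, sizeof(value)) !=
            (ssize_t)sizeof(value))
                errx(1, "kvm_read: %s", kvm_geterr(kd));

        printf("foo_counter = %d\n", value);
        kvm_close(kd);
        return 0;
}</programlisting></para>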
      </sect3>
      
      <sect3 id="sample_driver">
	<title>Is there a simple PCI driver I can use as an example?</title>
	<para>You can look at
	  <filename>sys/dev/pci/puc.c</filename>, which is one of
	  the simplest drivers. PUCs are devices with one or more
	  serial or parallel ports on them, usually using standard
	  chips (e.g. 16550 UART for serial). This driver just
	  locates the I/O addresses of the registers of the serial or
	  parallel controller and passes them to the serial or
	  parallel driver.</para>
      </sect3>
      
      <sect3 id="other-related-links">
	<title>Other related links</title>
	
	<itemizedlist>
	  <listitem>&man.driver.9; - NetBSD autoconfiguration
	    interface utilised by device drivers</listitem>
	  <listitem>&man.autoconf.9; - General description of the
	    NetBSD autoconfiguration framework</listitem>
	  <listitem>&man.config.9; - The autoconfiguration
	    framework ``device definition'' language</listitem>
	  <listitem>&man.bus.dma.9; - NetBSD's bus and machine
	    independent DMA framework, described in its own <ulink
	      url="bus_dma.pdf">paper</ulink> (64k, PDF)</listitem>
	  <listitem>&man.bus.space.9; - NetBSD's bus space
	    manipulation interface</listitem>
	  <listitem><ulink url="scsidma.html">How SCSI DMA
	      works</ulink> - by Tohru Nishimura</listitem>
	  <listitem><ulink url="lazyfpu.html">How lazy FPU context
	      switch works</ulink> - by Tohru Nishimura</listitem>
	  <listitem><ulink 
	      url="converting-ethernet-drivers.html">Converting ancient BSD Ethernet drivers to NetBSD-1.2D and later</ulink></listitem>
	  <listitem><ulink 
	      url="porting-freebsd-net.html">Notes on porting FreeBSD network drivers to NetBSD</ulink></listitem>
	</itemizedlist>
      </sect3>
    </sect2>
  </sect1>
  
  <parentsec url="./" text="NetBSD Documentation: Kernel"/>
  
</webpage>

