Android QEMU FAST PIPES
=======================

Introduction:
-------------

The Android emulator implements a special virtual device used to provide
_very_ fast communication channels between the guest system and the
emulator itself.

From the guest, usage is simply as follows:

  1/ Open the /dev/qemu_pipe device for read+write

     NOTE: Starting with Linux 3.10, the device was renamed as
           /dev/goldfish_pipe but behaves exactly in the same way.

  2/ Write a zero-terminated string describing which service you want to
     connect.

  3/ Simply use read() and write() to communicate with the service.

In other words:

   fd = open("/dev/qemu_pipe", O_RDWR);
   const char* pipeName = "<pipename>";
   ret = write(fd, pipeName, strlen(pipeName)+1);
   if (ret < 0) {
       // error
   }
   ... ready to go

Where <pipename> is the name of a specific emulator service you want to use.
Supported service names are listed later in this document.


Implementation details:
-----------------------

In the emulator source tree:

    ./hw/android/goldfish/pipe.c implements the virtual driver.

    ./hw/android/goldfish/pipe.h provides the interface that must be
    implemented by any emulator pipe service.

    ./android/hw-pipe-net.c contains the implementation of the network pipe
    services (i.e. 'tcp' and 'unix'). See below for details.

In the kernel source tree:

    drivers/misc/qemupipe/qemu_pipe.c contains the driver source code
    that will be accessible as /dev/qemu_pipe within the guest.


Device / Driver Protocol details:
---------------------------------

The device and driver use an I/O memory page and an IRQ to communicate.

  - The driver writes to various I/O registers to send commands to the
    device.

  - The device raises an IRQ to instruct the driver that certain events
    occured.

  - The driver reads I/O registers to get the status of its latest command,
    or the list of events that occured in case of interrupt.

Each opened file descriptor to /dev/qemu_pipe in the guest corresponds to a
32-bit 'channel' value allocated by the driver.

The following is a description of the various commands sent by the driver
to the device. Variable names beginning with REG_ correspond to 32-bit I/O
registers:

  0/ Channel and address values:

     Each communication channel is identified by a unique non-zero value
     which is either 32-bit or 64-bit, depending on the guest CPU
     architecture.

     The channel value sent from the kernel to the emulator with:

        void write_channel(channel) {
        #if 64BIT_GUEST_CPU
          REG_CHANNEL_HIGH = (channel >> 32);
        #endif
          REG_CHANNEL = (channel & 0xffffffffU);
        }

    Similarly, when passing a kernel address to the emulator:

        void write_address(buffer_address) {
        #if 64BIT_GUEST_CPU
          REG_ADDRESS_HIGH = (buffer_address >> 32);
        #endif
          REG_ADDRESS = (buffer_address & 0xffffffffU);
        }

  1/ Creating a new channel:

     Used by the driver to indicate that the guest just opened /dev/qemu_pipe
     that will be identified by a named '<channel>':

        write_channel(<channel>)
        REG_CMD = CMD_OPEN

     IMPORTANT: <channel> should never be 0

  2/ Closing a channel:

     Used by the driver to indicate that the guest called 'close' on the
     channel file descriptor.

        write_channel(<channel>)
        REG_CMD = CMD_CLOSE

  3/ Writing data to the channel:

     Corresponds to when the guest does a write() or writev() on the
     channel's file descriptor. This command is used to send a single
     memory buffer:

        write_channel(<channel>)
        write_address(<buffer-address>)
        REG_SIZE    = <buffer-size>
        REG_CMD     = CMD_WRITE_BUFFER

        status = REG_STATUS

    NOTE: The <buffer-address> is the *GUEST* buffer address, not the
          physical/kernel one.

    IMPORTANT: The buffer sent through this command SHALL ALWAYS be entirely
               contained inside a single page of guest memory. This is
               enforced to simplify both the driver and the device.

               When a write() spans several pages of guest memory, the
               driver will issue several CMD_WRITE_BUFFER commands in
               succession, transparently to the client.

    The value returned by REG_STATUS should be:

       > 0    The number of bytes that were written to the pipe
       0      To indicate end-of-stream status
       < 0    A negative error code (see below).

    On important error code is PIPE_ERROR_AGAIN, used to indicate that
    writes can't be performed yet. See CMD_WAKE_ON_WRITE for more.

  4/ Reading data from the channel:

    Corresponds to when the guest does a read() or readv() on the
    channel's file descriptor.

        write_channel(<channel>)
        write_address(<buffer-address>)
        REG_SIZE    = <buffer-size>
        REG_CMD     = CMD_READ_BUFFER

        status = REG_STATUS

    Same restrictions on buffer addresses/lengths and same set of error
    codes.

  5/ Waiting for write ability:

    If CMD_WRITE_BUFFER returns PIPE_ERROR_AGAIN, and the file descriptor
    is not in non-blocking mode, the driver must put the client task on a
    wait queue until the pipe service can accept data again.

    Before this, the driver will do:

        write_channel(<channel>)
        REG_CMD     = CMD_WAKE_ON_WRITE

    To indicate to the virtual device that it is waiting and should be woken
    up when the pipe becomes writable again. How this is done is explained
    later.

  6/ Waiting for read ability:

    This is the same than CMD_WAKE_ON_WRITE, but for readability instead.

        write_channel(<channel>)
        REG_CMD     = CMD_WAKE_ON_READ

  7/ Polling for write-able/read-able state:

    The following command is used by the driver to implement the select(),
    poll() and epoll() system calls where a pipe channel is involved.

        write_channel(<channel>)
        REG_CMD     = CMD_POLL
        mask = REG_STATUS

    The mask value returned by REG_STATUS is a mix of bit-flags for
    which events are available / have occured since the last call.
    See PIPE_POLL_READ / PIPE_POLL_WRITE / PIPE_POLL_CLOSED.

  8/ Signaling events to the driver:

    The device can signal events to the driver by raising its IRQ.
    The driver's interrupt handler will then have to read a list of
    (channel,mask) pairs, terminated by a single 0 value for the channel.

    In other words, the driver's interrupt handler will do:

        for (;;) {
            channel = REG_CHANNEL
            if (channel == 0)  // END OF LIST
                break;

            mask = REG_WAKES  // BIT FLAGS OF EVENTS
            ... process events
        }

    The events reported through this list are simply:

       PIPE_WAKE_READ   :: the channel is now readable.
       PIPE_WAKE_WRITE  :: the channel is now writable.
       PIPE_WAKE_CLOSED :: the pipe service closed the connection.

    The PIPE_WAKE_READ and PIPE_WAKE_WRITE are only reported for a given
    channel if CMD_WAKE_ON_READ or CMD_WAKE_ON_WRITE (respectively) were
    issued for it.

    PIPE_WAKE_CLOSED can be signaled at any time.


  9/ Faster read/writes through parameter blocks:

    Recent Goldfish kernels implement a faster way to perform reads and writes
    that perform a single I/O write per operation (which is useful when
    emulating x86 system through KVM or HAX).

    This uses the following structure known to both the virtual device and
    the kernel, defined in $QEMU/hw/android/goldfish/pipe.h:

    For 32-bit guest CPUs:

        struct access_params {
            uint32_t channel;
            uint32_t size;
            uint32_t address;
            uint32_t cmd;
            uint32_t result;
            /* reserved for future extension */
            uint32_t flags;
        };

    And the 64-bit variant:

        struct access_params_64 {
            uint64_t channel;
            uint32_t size;
            uint64_t address;
            uint32_t cmd;
            uint32_t result;
            /* reserved for future extension */
            uint32_t flags;
        };

    This is simply a way to pack several parameters into a single structure.
    Preliminary, e.g. at boot time, the kernel will allocate one such structure
    and pass its physical address with:

       PARAMS_ADDR_LOW  = (params & 0xffffffff);
       PARAMS_ADDR_HIGH = (params >> 32) & 0xffffffff;

    Then for each operation, it will do something like:

        params.channel = channel;
        params.address = buffer;
        params.size = buffer_size;
        params.cmd = CMD_WRITE_BUFFER (or CMD_READ_BUFFER)

        REG_ACCESS_PARAMS = <any>

        status = params.status

    The write to REG_ACCESS_PARAMS will trigger the operation, i.e. QEMU will
    read the content of the params block, use its fields to perform the
    operation then write back the return value into params.status.


Available services:
-------------------

  tcp:<port>

     Open a TCP socket to a given localhost port. This provides a very fast
     pass-through that doesn't depend on the very slow internal emulator
     NAT router. Note that you can only use the file descriptor with read()
     and write() though, send() and recv() will return an ENOTSOCK error,
     as well as any socket ioctl().

     For security reasons, it is not possible to connect to non-localhost
     ports.

  unix:<path>

     Open a Unix-domain socket on the host.

  opengles

     Connects to the OpenGL ES emulation process. For now, the implementation
     is equivalent to tcp:22468, but this may change in the future.

  qemud

     Connects to the QEMUD service inside the emulator. This replaces the
     connection that was performed through /dev/ttyS1 in older Android platform
     releases. See $QEMU/docs/ANDROID-QEMUD.TXT for details.