<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-us">
<link rel="stylesheet" href="http://www.unicode.org/reports/reports.css"
type="text/css">
<title>UTS #35: Unicode LDML: Keyboards</title>
<style type="text/css">
<!--
.dtd {
font-family: monospace;
font-size: 90%;
background-color: #CCCCFF;
border-style: dotted;
border-width: 1px;
}
.xmlExample {
font-family: monospace;
font-size: 80%
}
.blockedInherited {
font-style: italic;
font-weight: bold;
border-style: dashed;
border-width: 1px;
background-color: #FF0000
}
.inherited {
font-weight: bold;
border-style: dashed;
border-width: 1px;
background-color: #00FF00
}
.element {
font-weight: bold;
color: red;
}
.attribute {
font-weight: bold;
color: maroon;
}
.attributeValue {
font-weight: bold;
color: blue;
}
li, p {
margin-top: 0.5em;
margin-bottom: 0.5em
}
h2, h3, h4, table {
margin-top: 1.5em;
margin-bottom: 0.5em;
}
-->
</style>
</head>
<body>
<table class="header" width="100%">
<tr>
<td class="icon"><a href="http://unicode.org"> <img
alt="[Unicode]" src="http://unicode.org/webscripts/logo60s2.gif"
width="34" height="33"
style="vertical-align: middle; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px; border-top-width: 0px;"></a>
<a class="bar" href="http://www.unicode.org/reports/">Technical
Reports</a></td>
</tr>
<tr>
<td class="gray"> </td>
</tr>
</table>
<div class="body">
<h2 style="text-align: center">
Unicode Technical
Standard #35
</h2>
<h1>
Unicode Locale Data Markup Language (LDML)<br>Part 7: Keyboards
</h1>
<!-- At least the first row of this header table should be identical across the parts of this UTS. -->
<table border="1" cellpadding="2" cellspacing="0" class="wide">
<tr>
<td>Version</td>
<td>34</td>
</tr>
<tr>
<td>Editors</td>
<td>Steven Loomis (<a href="mailto:srl@icu-project.org">srl@icu-project.org</a>)
and <a href="tr35.html#Acknowledgments">other CLDR committee
members</a></td>
</tr>
</table>
<p>
For the full header, summary, and status, see <a href="tr35.html">
Part 1: Core</a>
</p>
<h3>
<i>Summary</i>
</h3>
<p>
This document describes parts of an XML format (<i>vocabulary</i>)
for the exchange of structured locale data. This format is used in
the <a href="http://cldr.unicode.org/">Unicode Common Locale Data
Repository</a>.
</p>
<p>
This is a partial document, describing keyboard mappings. For the
other parts of the LDML see the <a href="tr35.html">main LDML
document</a> and the links above.
</p>
<h3>
<i>Status</i>
</h3>
<!-- NOT YET APPROVED
<p>
<i class="changed">This is a<b><font color="#ff3333">
draft </font></b>document which may be updated, replaced, or superseded by
other documents at any time. Publication does not imply endorsement
by the Unicode Consortium. This is not a stable document; it is
inappropriate to cite this document as other than a work in
progress.
</i>
</p>
END NOT YET APPROVED -->
<!-- APPROVED -->
<p>
<i>This document has been reviewed by Unicode members and other
interested parties, and has been approved for publication by the
Unicode Consortium. This is a stable document and may be used as
reference material or cited as a normative reference by other
specifications.</i>
</p>
<!-- END APPROVED -->
<blockquote>
<p>
<i><b>A Unicode Technical Standard (UTS)</b> is an independent
specification. Conformance to the Unicode Standard does not imply
conformance to any UTS.</i>
</p>
</blockquote>
<p>
<i>Please submit corrigenda and other comments with the CLDR bug
reporting form [<a href="tr35.html#Bugs">Bugs</a>]. Related
information that is useful in understanding this document is found
in the <a href="tr35.html#References">References</a>. For the latest
version of the Unicode Standard see [<a href="tr35.html#Unicode">Unicode</a>].
For a list of current Unicode Technical Reports see [<a
href="tr35.html#Reports">Reports</a>]. For more information about
versions of the Unicode Standard, see [<a href="tr35.html#Versions">Versions</a>].
</i>
</p>
<h2>
<a name="Parts" href="#Parts">Parts</a>
</h2>
<!-- This section of Parts should be identical in all of the parts of this UTS. -->
<p>The LDML specification is divided into the following parts:</p>
<ul class="toc">
<li>Part 1: <a href="tr35.html#Contents">Core</a> (languages,
locales, basic structure)
</li>
<li>Part 2: <a href="tr35-general.html#Contents">General</a>
(display names & transforms, etc.)
</li>
<li>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a>
(number & currency formatting)
</li>
<li>Part 4: <a href="tr35-dates.html#Contents">Dates</a> (date,
time, time zone formatting)
</li>
<li>Part 5: <a href="tr35-collation.html#Contents">Collation</a>
(sorting, searching, grouping)
</li>
<li>Part 6: <a href="tr35-info.html#Contents">Supplemental</a>
(supplemental data)
</li>
<li>Part 7: <a href="tr35-keyboards.html#Contents">Keyboards</a>
(keyboard mappings)
</li>
</ul>
<h2>
<a name="Contents" href="#Contents">Contents of Part 7, Keyboards</a>
</h2>
<!-- START Generated TOC: CheckHtmlFiles -->
<ul class="toc">
<li>1 <a href="#Introduction">Keyboards</a></li>
<li>2 <a href="#Goals_and_Nongoals">Goals and Nongoals</a></li>
<li>3 <a href="#Definitions">Definitions</a></li>
<li>4 <a href="#File_and_Dir_Structure">File and Directory
Structure</a></li>
<li>5 <a href="#Element_Heirarchy_Layout_File">Element
Hierarchy - Layout File</a>
<ul class="toc">
<li>5.1 <a href="#Element_Keyboard">Element: keyboard</a></li>
<li>5.2 <a href="#Element_version">Element: version</a></li>
<li>5.3 <a href="#Element_generation">Element: generation</a></li>
<li>5.4 <a href="#Element_names">Element: names</a></li>
<li>5.5 <a href="#Element_name">Element: name</a></li>
<li>5.6 <a href="#Element_settings">Element: settings</a></li>
<li>5.7 <a href="#Element_keyMap">Element: keyMap</a>
<ul class="toc">
<li>Table: <a href="#Possible_Modifier_Keys">Possible
Modifier Keys</a></li>
</ul>
</li>
<li>5.8 <a href="#Element_map">Element: map</a></li>
<li>5.9 <a href="#Element_import">Element:
import</a></li>
<li>5.10 <a href="#Element_displayMap">Element:
displayMap</a></li>
<li>5.11 <a href="#Element_display">Element:
display</a></li>
<li>5.12 <a href="#Element_layer">Element:
layer</a></li>
<li>5.13 <a href="#Element_row">Element:
row</a></li>
<li>5.14 <a href="#Element_switch">Element:
switch</a></li>
<li>5.15 <a href="#Element_vkeys">Element:
vkeys</a></li>
<li>5.16 <a href="#Element_vkey">Element:
vkey</a></li>
<li>5.17 <a href="#Element_transforms">Element:
transforms</a></li>
<li>5.18 <a href="#Element_transform">Element:
transform</a></li>
<li>5.19 <a href="#Element_reorder">Element:
reorder</a></li>
<li>5.20 <a href="#Element_final">Element:
final</a></li>
<li>5.21 <a href="#Element_backspaces">Element:
backspaces</a></li>
<li>5.22 <a href="#Element_backspace">Element:
backspace</a></li>
</ul>
</li>
<li>6 <a href="#Element_Heirarchy_Platform_File">Element
Hierarchy - Platform File</a>
<ul class="toc">
<li>6.1 <a href="#Element_platform">Element: platform</a></li>
<li>6.2 <a href="#Element_hardwareMap">Element:
hardwareMap</a></li>
<li>6.3 <a href="#Element_hardwareMap_map">Element: map</a></li>
</ul>
</li>
<li>7 <a href="#Invariants">Invariants</a></li>
<li>8 <a href="#Data_Sources">Data Sources</a>
<ul class="toc">
<li>Table: <a href="#Key_Map_Data_Sources">Key Map Data
Sources</a></li>
</ul>
</li>
<li>9 <a href="#Keyboard_IDs">Keyboard IDs</a>
<ul class="toc">
<li>9.1 <a href="#Principles_for_Keyboard_Ids">Principles
for Keyboard Ids</a></li>
</ul>
</li>
<li>10 <a href="#Platform_Behaviors_in_Edge_Cases">Platform
Behaviors in Edge Cases</a></li>
</ul>
<!-- END Generated TOC: CheckHtmlFiles -->
<h2>
1 <a name="Introduction" href="#Introduction">Keyboards</a><a
name="Keyboards" href="#Keyboards"></a>
</h2>
<p>The CLDR keyboard format provides for the communication of
keyboard mapping data between different modules, and the comparison
of data across different vendors and platforms. The standardized
identifier for keyboards can be used to communicate, internally or
externally, a request for a particular keyboard mapping that is to be
used to transform either text or keystrokes. The corresponding data
can then be used to perform the requested actions.</p>
<p>For example, a web-based virtual keyboard may transform text in
the following way. Suppose the user types a key that produces a
"W" on a qwerty keyboard. A web-based tool using an azerty
virtual keyboard can map that text ("W") to the text that
would have resulted from typing a key on an azerty keyboard, by
transforming "W" to "Z". Such transforms are in
fact performed in existing web applications.</p>
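<p>The text transform described above can be sketched in a few lines. This is only an illustration: the table <code>QWERTY_TO_AZERTY</code> and the function <code>retype_as_azerty</code> are hypothetical names, and the table covers only four of the letter keys that differ between the two layouts.</p>

```python
# Partial, illustrative QWERTY -> AZERTY table (assumption: only four of the
# letter keys that differ between the layouts are listed here).
QWERTY_TO_AZERTY = {"q": "a", "a": "q", "w": "z", "z": "w"}


def retype_as_azerty(text: str) -> str:
    """Map text typed on a QWERTY keyboard to what the same keys yield on AZERTY."""
    out = []
    for ch in text:
        lower = ch.lower()
        mapped = QWERTY_TO_AZERTY.get(lower, lower)
        # Preserve the case of the character that was typed.
        out.append(mapped.upper() if ch.isupper() else mapped)
    return "".join(out)


print(retype_as_azerty("W"))
```

<p>A real implementation would derive such a table from the key maps of the two layouts rather than hard-coding it.</p>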
<p>The data can also be used in analysis of the capabilities of
different keyboards. It also allows better interoperability by making
it easier for keyboard designers to see which characters are
generally supported on keyboards for given languages.</p>
<p>To illustrate this specification, here is an abridged layout
representing the English US 101 keyboard on the Mac OSX operating
system (with an inserted long-press example). For more complete
examples, and information collected about keyboards, see keyboard
data in XML.</p>
<pre><keyboard locale="en-t-k0-osx"><br> <version platform="10.4" number="$Revision: 8294 $" /><br> <names><br> <name value="U.S." /><br> </names><br> <keyMap><br> <map iso="E00" to="`" /><br> <map iso="E01" to="1" /><br> <map iso="D01" to="q" /><br> <map iso="D02" to="w" /><br> <map iso="D03" to="e" longPress="é è ê ë" /><br> …<br> </keyMap><br> <keyMap modifiers="caps"><br> <map iso="E00" to="`" /><br> <map iso="E01" to="1" /><br> <map iso="D01" to="Q" /><br> <map iso="D02" to="W" /><br> …<br> </keyMap><br> <keyMap modifiers="opt"><br> <map iso="E00" to="`" /><br> <map iso="E01" to="¡" /> <!-- key=1 --><br> <map iso="D01" to="œ" /> <!-- key=Q --><br> <map iso="D02" to="∑" /> <!-- key=W --><br> …<br> </keyMap><br> <transforms type="simple"><br> <transform from="` " to="`" /><br> <transform from="`a" to="à" /><br> <transform from="`A" to="À" /><br> <transform from="´ " to="´" /><br> <transform from="´a" to="á" /><br> <transform from="´A" to="Á" /><br> <transform from="˜ " to="˜" /><br> <transform from="˜a" to="ã" /><br> <transform from="˜A" to="Ã" /><br> …<br> </transforms><br></keyboard></pre>
<p>And its associated platform file (which includes the hardware
mapping):</p>
<pre><platform id="osx"><br> <hardwareMap><br> <map keycode="0" iso="C01" /><br> <map keycode="1" iso="C02" /><br> <map keycode="6" iso="B01" /><br> <map keycode="7" iso="B02" /><br> <map keycode="12" iso="D01" /><br> <map keycode="13" iso="D02" /><br> <map keycode="18" iso="E01" /><br> <map keycode="50" iso="E00" /><br> </hardwareMap><br></platform></pre>
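<p>As a rough sketch of how the two files work together, the following resolves a hardware key code to an ISO position through the platform file and then to an output string through the layout file. The data is an abridged copy of the examples above. The helper names (<code>load_hardware_map</code>, <code>load_key_maps</code>, <code>resolve</code>) are hypothetical, and the <code>modifiers</code> attribute is treated as an opaque string here rather than being expanded into combinations.</p>

```python
import xml.etree.ElementTree as ET

# Abridged copies of the platform and layout examples above.
PLATFORM_XML = """
<platform id="osx">
  <hardwareMap>
    <map keycode="12" iso="D01" />
    <map keycode="13" iso="D02" />
    <map keycode="18" iso="E01" />
    <map keycode="50" iso="E00" />
  </hardwareMap>
</platform>
"""

LAYOUT_XML = """
<keyboard locale="en-t-k0-osx">
  <keyMap>
    <map iso="E00" to="`" />
    <map iso="E01" to="1" />
    <map iso="D01" to="q" />
    <map iso="D02" to="w" />
  </keyMap>
  <keyMap modifiers="caps">
    <map iso="D01" to="Q" />
    <map iso="D02" to="W" />
  </keyMap>
</keyboard>
"""


def load_hardware_map(platform_xml):
    """Return {key code: ISO position} from a platform file."""
    root = ET.fromstring(platform_xml)
    return {int(m.get("keycode")): m.get("iso") for m in root.iter("map")}


def load_key_maps(layout_xml):
    """Return {modifiers attribute ("" for the base map): {ISO position: output}}."""
    root = ET.fromstring(layout_xml)
    return {
        km.get("modifiers", ""): {m.get("iso"): m.get("to") for m in km.findall("map")}
        for km in root.findall("keyMap")
    }


HARDWARE_MAP = load_hardware_map(PLATFORM_XML)
KEY_MAPS = load_key_maps(LAYOUT_XML)


def resolve(keycode, modifiers=""):
    """Key code -> ISO position -> output; None if either step has no entry."""
    iso = HARDWARE_MAP.get(keycode)
    if iso is None:
        return None
    return KEY_MAPS.get(modifiers, {}).get(iso)


print(resolve(12), resolve(13, "caps"))
```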
<h2>
2 <a name="Goals_and_Nongoals" href="#Goals_and_Nongoals">Goals
and Nongoals</a>
</h2>
<p>Some goals of this format are:</p>
<ol>
<li>Make the XML as readable as possible.</li>
<li>Represent faithfully keyboard data from major platforms: it
should be possible to create a functionally-equivalent data file
(such that given any input, it can produce the same output).</li>
<li>Make as much commonality in the data across platforms as
possible to make comparison easy.</li>
</ol>
<p>Some non-goals (outside the scope of the format) currently are:</p>
<ol>
<li>Display names or symbols for keycaps (eg, the German name
for "Return"). If that were added to LDML, it would be in a
different structure, outside the scope of this section.</li>
<li>Advanced IME features, handwriting recognition, etc.</li>
<li>Roundtrip mappings—the ability to recover precisely the same
format as an original platform's representation. In particular, the
internal structure may have no relation to the internal structure of
external keyboard source data; the only goal is functional
equivalence.</li>
<li>More sophisticated transforms, such as for Indic character
rearrangement. It is anticipated that these would be added to a
future version, after working out a reasonable representation.</li>
</ol>
<p>Note: During development of this section, it was considered
whether the modifier RAlt (=AltGr) should be merged with Option. In
the end, they were kept separate, but for comparison across platforms
implementers may choose to unify them.</p>
<p>
Note that in parts of this document, the format <strong>@x</strong>
is used to indicate the <em>attribute</em> <strong>x</strong>.
</p>
<h2>
3 <a name="Definitions" href="#Definitions">Definitions</a>
</h2>
<p>
<b>Arrangement</b> is the term used to describe the relative position
of the rectangles that represent keys, either physically or
virtually. A physical keyboard has a static arrangement while a
virtual keyboard may have a dynamic arrangement that changes per
language and/or layer. While the arrangement of keys on a keyboard
may be fixed, the mapping of those keys may vary.
</p>
<p>
<b>Base character:</b> The character emitted by a particular key when
no modifiers are active. In ISO terms, this is group 1, level 1.
</p>
<p>
<b>Base map:</b> A mapping from the ISO positions to the base
characters. There is only one base map per layout. The characters on
this map can be output by not using any modifier keys.
</p>
<p>
<b>Core keyboard layout:</b> also known as “alpha” block. The primary
set of key values on a keyboard that are used for typing the target
language of the keyboard. For example, the three rows of letters on a
standard US QWERTY keyboard (QWERTYUIOP, ASDFGHJKL, ZXCVBNM) together
with the most significant punctuation keys. Usually this equates to
the minimal keyset for a language as seen on mobile phone keyboards.
</p>
<p>
<b>Hardware map:</b> A mapping between key codes and ISO layout
positions.
</p>
<p>
<b>Input Method Editor (IME):</b> a component or program that
supports input of large character sets. Typically, IMEs employ
contextual logic and candidate UI to identify the Unicode characters
intended by the user.
</p>
<p>
<b>ISO position:</b> The corresponding position of a key using the
ISO layout convention where rows are identified by letters and
columns are identified by numbers. For example, "D01" corresponds to
the "Q" key on a US keyboard. For the purposes of this document, an
ISO layout position is depicted by a one-letter row identifier
followed by a two digit column number (like "B03", "E12" or "C00").
The following diagram depicts a typical US keyboard layout
superimposed with the ISO layout indicators (it is important to note
that the number of keys and their physical placement relative to
each-other in this diagram is irrelevant, rather what is important is
their logical placement using the ISO convention):<img
src="images/keyPositions.png"
alt="keyboard layout example showing ISO key numbering">
</p>
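<p>The one-letter row plus two-digit column form of an ISO position can be checked mechanically. The following is a minimal sketch; the function name <code>parse_iso_position</code> is hypothetical, and the restriction to uppercase ASCII row letters is an assumption based on the examples in this document.</p>

```python
import re

# One uppercase row letter followed by exactly two ASCII digits, e.g. "D01".
ISO_POSITION = re.compile(r"[A-Z][0-9]{2}")


def parse_iso_position(position):
    """Split an ISO position such as "D01" into (row letter, column number)."""
    if ISO_POSITION.fullmatch(position) is None:
        raise ValueError(f"not an ISO position: {position!r}")
    return position[0], int(position[1:])


print(parse_iso_position("D01"))
```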
<p>One may also extend the notion of the ISO layout to support
keys that don't map directly to the diagram above (such as the
Android device - see diagram). Per the ISO standard, the space bar is
mapped to "A03", so the period and comma keys are mapped to "A02" and
"A04" respectively based on their relative position to the space bar.
Also note that the "E" row does not exist on the Android keyboard.</p>
<p>
<img src="images/androidKeyboard.png"
alt="keyboard layout example showing extension of ISO key numbering">
</p>
<p>If it becomes necessary in the future, the format could extend
the ISO layout to support keys that are located to the left of the
"00" column by using negative column numbers "-01", "-02" and so on,
or 100's complement "99", "98",...</p>
<p>
<b>Key:</b> A key on a physical keyboard.
</p>
<p>
<b>Key code:</b> The integer code sent to the application on pressing
a key.
</p>
<p>
<b>Key map:</b> The basic mapping between ISO positions and the
output characters for each set of modifier combinations associated
with a particular layout. There may be multiple key maps for each
layout.
</p>
<p>
<b>Keyboard:</b> The physical keyboard.
</p>
<p>
<b>Keyboard layout:</b> A layout is the
overall keyboard configuration for a particular locale. Within a
keyboard layout, there is a single base map, one or more key maps and
zero or
more transforms.
</p>
<p>
<b>Layer</b> is an arrangement of keys on a virtual keyboard. Since
a virtual keyboard is often not meant to be used with two hands,
holding a modifier key while pressing another key is impractical.
Modifier keys are therefore made sticky: when one is pressed, the
visual representation of the keys, and even their arrangement,
changes, and the user then presses the desired key. Each such visual
representation is a layer. Thus a virtual keyboard is made up of a
set of layers.
</p>
<p>
<b>Long-press key:</b> also known as a “child key”. A secondary key
that is invoked from a top level key on a software keyboard.
Secondary keys typically provide access to variants of the top level
key, such as accented variants (a => á, à, ä, ã)
</p>
<p>
<b>Modifier:</b> A key that is held to change the behavior of a
keyboard. For example, the "Shift" key allows access to upper-case
characters on a US keyboard. Other modifier keys include, but are not
limited to: Ctrl, Alt, Option, Command and Caps Lock.
</p>
<p>
<b>Physical keyboard</b> is a keyboard that has individual keys that
are pressed. Each key has a unique identifier and the arrangement
doesn't change, even if the mapping of those keys does.
</p>
<p>
<b>Transform:</b> A transform is an
element that specifies a set of conversions from sequences of code
points into one (or more) other code points. For example, in most
Latin keyboards hitting the "^" dead-key followed by the "e" key
produces "ê".
</p>
<p>
<b>Virtual keyboard</b> is a keyboard that is rendered on a surface,
typically a touch surface. It has a dynamic arrangement and contrasts
with a physical keyboard. This term has many synonyms: touch
keyboard, software keyboard, SIP (Software Input Panel). This usage
contrasts with other uses of the term virtual keyboard as an
on-screen keyboard for reference or accessibility data entry.
</p>
<h2>
4 <a name="File_and_Dir_Structure" href="#File_and_Dir_Structure">File
and Directory Structure</a>
</h2>
<p>Each platform has its own directory, where a "platform" is a
designation for a set of keyboards available from a particular
source, such as Windows or ChromeOS. This directory name is the
platform name (see Table 2 located further in the document). Within
this directory there are two types of files:</p>
<ol>
<li>A single platform file (see XML structure for Platform
file). This file includes a mapping of hardware key codes to the ISO
layout positions. It is also open to expansion for any
configuration elements that are valid across the whole platform and
that are not layout specific. This file is simply called
_platform.xml.</li>
<li>Multiple layout files named by their locale identifiers.
(eg. lt-t-k0-chromeos.xml or ne-t-k0-windows.xml).</li>
</ol>
<p>Keyboard data that is not supported on a given platform, but
intended for use with that platform, may be added to the directory
/und/. For example, there could be a file /und/lt-t-k0-chromeos.xml,
where the data is intended for use with ChromeOS, but does not
reflect data that is distributed as part of a standard ChromeOS
release.</p>
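<p>The naming scheme can be illustrated with a small path-building sketch. The root directory name <code>keyboards</code> and the helper names are assumptions made for this example; only the per-platform directory, the <code>und</code> directory, <code>_platform.xml</code>, and the locale-identifier file names come from the description above.</p>

```python
from pathlib import PurePosixPath


def platform_file(root, platform):
    """Path of the single platform file for a platform directory."""
    return PurePosixPath(root) / platform / "_platform.xml"


def layout_file(root, platform, locale_id, supported=True):
    """Path of a layout file; data not supported on the platform goes under und/."""
    directory = platform if supported else "und"
    return PurePosixPath(root) / directory / f"{locale_id}.xml"


print(layout_file("keyboards", "chromeos", "lt-t-k0-chromeos"))
```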
<h2>
5 <a name="Element_Heirarchy_Layout_File"
href="#Element_Heirarchy_Layout_File">Element Hierarchy - Layout
File</a>
</h2>
<h3>
5.1 <a name="Element_Keyboard" href="#Element_Keyboard">Element:
keyboard</a>
</h3>
<p>This is the top level element. All other elements defined below
are under this element.</p>
<p>Syntax</p>
<p><keyboard locale="{locale ID}"></p>
<p>{definition of the layout as described by the elements defined
below}</p>
<p></keyboard></p>
<dl>
<dt>Attribute: locale (required)</dt>
<dd>
This mandatory attribute represents the locale of the keyboard using
Unicode locale identifiers (see <a href="tr35.html">LDML</a>) - for
example 'el' for Greek. Sometimes, the locale may not
specify the base language. For example, a Devanagari keyboard for
many languages could be specified by BCP-47 code: 'und-Deva'. For
details, see <a href="#Keyboard_IDs">Keyboard IDs</a> .
</dd>
</dl>
<p>Examples (for illustrative purposes only, not indicative of the
real data)</p>
<pre><keyboard locale="ka-t-k0-qwerty-windows">
…
</keyboard>
<keyboard locale="fr-CH-t-k0-android">
…
</keyboard></pre>
<hr>
<h3>
5.2 <a name="Element_version" href="#Element_version">Element:
version</a>
</h3>
<p>
Element used to keep track of the source data version.<br> <br>
Syntax
</p>
<p>
<version platform=".." number=".."><br>
</p>
<dl>
<dt>Attribute: platform (required)</dt>
<dd>The platform source version. Specifies what version of the
platform the data is from. For example, data from Mac OSX 10.4 would
be specified as platform="10.4". For platforms that have
unstable version numbers which change frequently (like Linux), this
field is set to an integer representing the iteration of the data
starting with "1". This number would only increase if there were any
significant changes in the keyboard data.</dd>
</dl>
<dl>
<dt>Attribute: number (required)</dt>
<dd>The data revision version.</dd>
</dl>
<dl>
<dt>Attribute: cldrVersion (fixed by DTD)</dt>
<dd>The CLDR specification version that is associated with this
data file. This value is fixed and is inherited from the DTD file
and therefore does not show up directly in the XML file.</dd>
</dl>
<p>Example</p>
<p><keyboard locale="..-osx"></p>
<p>…</p>
<p><version platform="10.4" number="1"/></p>
<p>…</p>
<p></keyboard></p>
<hr>
<h3>
5.3 <a name="Element_generation" href="#Element_generation">Element:
generation</a>
</h3>
<p>
The generation element is now deprecated. It was used to keep track
of the generation date of the data.
</p>
<hr>
<h3>
5.4 <a name="Element_names" href="#Element_names">Element: names</a>
</h3>
<p>
Element used to store any names given to the layout by the platform.<br>
<br> Syntax
</p>
<p><names></p>
<p>{set of name elements}</p>
<p>
</names><br>
</p>
<h3>
5.5 <a name="Element_name" href="#Element_name">Element: name</a>
</h3>
<p>
A single name given to the layout by the platform.<br> <br>
Syntax
</p>
<p>
<name value=".."><br>
</p>
<dl>
<dt>Attribute: value (required)</dt>
<dd>The name of the layout.</dd>
</dl>
<p>Example</p>
<p><keyboard
locale="bg-t-k0-windows-phonetic-trad"></p>
<p>…</p>
<p><names></p>
<p><name value="Bulgarian (Phonetic
Traditional)"/></p>
<p></names></p>
<p>…</p>
<p></keyboard></p>
<hr>
<h3>
5.6 <a name="Element_settings" href="#Element_settings">Element:
settings</a>
</h3>
<p>
An element used to keep track of layout specific settings. This
element may or may not show up on a layout. These settings reflect
the normal practice on the platform. However, an implementation using
the data may customize the behavior. For example, for
transformFailure the implementation could ignore the setting, or
modify the text buffer in some other way (such as by emitting
backspaces).<br> <br> Syntax
</p>
<p>
<settings [fallback="omit"]
[transformFailure="omit"]
[transformPartial="hide"]><br>
</p>
<dl>
<dt>Attribute: fallback="omit" (optional)</dt>
<dd>The presence of this attribute means that when a modifier
key combination goes unmatched, no output is produced. The default
behavior (when this attribute is not present) is to fall back to the
base map when the modifier key combination goes unmatched.</dd>
</dl>
<p>If this attribute is present, it must have a value of omit.</p>
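<p>A minimal sketch of the fallback rule follows. The key map data, the representation of modifier states as frozensets, and the function name <code>lookup</code> are illustrative assumptions, not part of the format.</p>

```python
# Key maps indexed by the exact set of active modifiers; the empty set is the
# base map. The data is invented for illustration.
KEY_MAPS = {
    frozenset(): {"D01": "q"},
    frozenset({"shift"}): {"D01": "Q"},
}


def lookup(iso, active_modifiers, fallback_omit=False):
    """Return the output for a key, applying the fallback setting on a miss."""
    key_map = KEY_MAPS.get(frozenset(active_modifiers))
    if key_map is None:
        if fallback_omit:
            return ""  # fallback="omit": an unmatched combination produces nothing
        key_map = KEY_MAPS[frozenset()]  # default: fall back to the base map
    return key_map.get(iso, "")


print(lookup("D01", {"ctrl"}))
```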
<dl>
<dt>Attribute: transformFailure="omit" (optional)</dt>
<dd>This attribute describes the behavior of a transform when it
is escaped (see the transform element in the Layout file for more
information). A transform is escaped when it can no longer continue
due to the entry of an invalid key. For example, suppose the
following set of transforms are valid:</dd>
</dl>
<blockquote>
<p>^e → ê</p>
<p>^a → â</p>
</blockquote>
<p>Suppose a user now enters the "^" key. Then "^" is stored in
a buffer and may or may not be shown to the user (see the
transformPartial attribute).</p>
<p>If a user now enters d, then the transform has failed and there
are two options for output.</p>
<p>1. default behavior - "^d"</p>
<p>2. omit - "" (nothing and the buffer is cleared)</p>
<p>The default behavior (when this attribute is not present) is to
emit the contents of the buffer upon failure of a transform.</p>
<p>If this attribute is present, it must have a value of omit.</p>
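<p>The two outcomes can be simulated with a small buffer-based sketch. It assumes the two transforms from the example above and a hypothetical function <code>type_keys</code>; it returns the emitted text together with whatever is still pending in the buffer, and it does not try to restart a transform with the key that caused the failure.</p>

```python
# The two transforms from the example above.
TRANSFORMS = {"^e": "ê", "^a": "â"}


def type_keys(keys, transform_failure_omit=False):
    """Return (emitted text, pending buffer) after typing the given keys."""
    emitted = []
    buffer = ""
    for key in keys:
        candidate = buffer + key
        if candidate in TRANSFORMS:
            emitted.append(TRANSFORMS[candidate])  # transform completed
            buffer = ""
        elif any(source.startswith(candidate) for source in TRANSFORMS):
            buffer = candidate  # still a possible prefix of some transform
        elif buffer:
            # The transform is escaped: emit the buffer plus the new key by
            # default, or nothing when transformFailure="omit" is in effect.
            if not transform_failure_omit:
                emitted.append(candidate)
            buffer = ""
        else:
            emitted.append(key)  # ordinary key outside any transform
    return "".join(emitted), buffer


print(type_keys("^d"), type_keys("^d", transform_failure_omit=True))
```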
<dl>
<dt>Attribute: transformPartial="hide" (optional)</dt>
<dd>This attribute describes the behavior of the system while in a
transform. When this attribute is present, the values of the buffer
are not shown as the user is typing a transform (this behavior can
be seen on Windows or Linux platforms).</dd>
</dl>
<p>By default (when this attribute is not present), show the
values of the buffer as the user is typing a transform (this behavior
can be seen on the Mac OSX platform).</p>
<p>If this attribute is present, it must have a value of hide.</p>
<p>Example</p>
<p><keyboard
locale="bg-t-k0-windows-phonetic-trad"></p>
<p>…</p>
<p><settings fallback="omit"
transformPartial="hide"></p>
<p>…</p>
<p></keyboard></p>
<p>Indicates that:</p>
<ol>
<li>When a modifier combination goes unmatched, do not output
anything when a key is pressed.</li>
<li>If a transform is escaped, output the contents of the
buffer.</li>
<li>During a transform, hide the contents of the buffer as the
user is typing.</li>
</ol>
<hr>
<h3>
5.7 <a name="Element_keyMap" href="#Element_keyMap">Element:
keyMap</a>
</h3>
<p>This element defines the group of mappings for all the keys
that use the same set of modifier keys. It contains one or more map
elements.</p>
<p>Syntax</p>
<p><keyMap [modifiers="{Set of Modifier
Combinations}"]></p>
<p>{a set of map elements}</p>
<p></keyMap></p>
<dl>
<dt>Attribute: modifiers (optional)</dt>
<dd>
A set of modifier combinations that cause this key map to be
"active". Each combination is separated by a space. The
interpretation is that there is a match if any of the combinations
match, that is, they are ORed. Therefore, the order of the
combinations within this attribute does not matter.<br> <br>
A combination is simply a concatenation of words to represent the
simultaneous activation of one or more modifier keys. The order of
the modifier keys within a combination does not matter, although
don't care cases are generally added to the end of the string for
readability (see next paragraph). For example: "cmd+caps" represents
the Caps Lock and Command modifier key combination. Some keys have
right or left variant keys, specified by a 'R' or 'L' suffix. For
example: "ctrlR+caps" would represent the Right-Control and Caps
Lock combination. For simplicity, the presence of a modifier without
a 'R' or 'L' suffix means that either its left or right variants are
valid. So "ctrl+caps" represents the same as "ctrlL+ctrlR?+caps
ctrlL?+ctrlR+caps"
</dd>
</dl>
<p>A modifier key may be further specified to be in a "don't care"
state using the '?' suffix. The "don't care" state simply means that
the preceding modifier key may be either ON or OFF. For example
"ctrl+shift?" could be expanded into "ctrl ctrl+shift".</p>
<p>Within a combination, the presence of a modifier WITHOUT the
'?' suffix indicates this key MUST be on. The converse is also true,
the absence of a modifier key means it MUST be off for the
combination to be active.</p>
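<p>The matching rules can be made concrete by expanding a <code>modifiers</code> value into the explicit set of modifier states it accepts. The sketch below is illustrative only: the function names are hypothetical, and it does not expand unsuffixed names such as <code>ctrl</code> into their left and right variants.</p>

```python
from itertools import product


def expand_combination(combination):
    """Expand one combination such as "ctrl+shift?" into concrete modifier sets."""
    required, optional = [], []
    for token in combination.split("+"):
        if token.endswith("?"):
            optional.append(token[:-1])  # "don't care": may be ON or OFF
        else:
            required.append(token)  # no suffix: MUST be ON
    expanded = set()
    for flags in product((False, True), repeat=len(optional)):
        keys = set(required)
        keys.update(name for name, on in zip(optional, flags) if on)
        expanded.add(frozenset(keys))
    return expanded


def expand_modifiers(attribute):
    """Expand a whole modifiers attribute; its combinations are ORed together."""
    result = set()
    for combination in attribute.split():
        result |= expand_combination(combination)
    return result


def is_active(attribute, pressed):
    """True if the exact set of pressed modifiers matches the attribute."""
    return frozenset(pressed) in expand_modifiers(attribute)


print(sorted(sorted(s) for s in expand_modifiers("ctrl+shift?")))
```

<p>Because a modifier that is absent from a combination must be off, matching is a test for set equality against one of the expanded states, not a subset test.</p>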
<p>Here is an exhaustive list of all possible modifier keys:</p>
<table>
<caption>
<a name="Possible_Modifier_Keys" href="#Possible_Modifier_Keys">Possible
Modifier Keys</a>
</caption>
<tbody>
<tr>
<td><p>Modifier Keys</p></td>
<td> </td>
<td><p>Comments</p></td>
</tr>
<tr>
<td><p>altL</p></td>
<td><p>altR</p></td>
<td><p>xAlty → xAltR+AltL? xAltR?AltLy</p></td>
</tr>
<tr>
<td><p>ctrlL</p></td>
<td><p>ctrlR</p></td>
<td><p>ditto for Ctrl</p></td>
</tr>
<tr>
<td><p>shiftL</p></td>
<td><p>shiftR</p></td>
<td><p>ditto for Shift</p></td>
</tr>
<tr>
<td><p>optL</p></td>
<td><p>optR</p></td>
<td><p>ditto for Opt</p></td>
</tr>
<tr>
<td><p>caps</p></td>
<td> </td>
<td><p>Caps Lock</p></td>
</tr>
<tr>
<td><p>cmd</p></td>
<td> </td>
<td><p>Command on the Mac</p></td>
</tr>
</tbody>
</table>
<p>All sets of modifier combinations within a layout are disjoint,
with no overlap between the key maps. That is, for every
possible modifier combination, there is at most a single match within
the layout file. There are thus never multiple matches. If no exact
match is available, the match falls back to the base map unless the
fallback="omit" attribute in the settings element is set,
in which case there would be no output at all.</p>
<p>To illustrate, the following example produces an invalid layout
because pressing the "Ctrl" modifier key produces an indeterminate
result:</p>
<p><keyMap modifiers="ctrl+shift?"></p>
<p>…</p>
<p></keyMap></p>
<p><keyMap modifiers="ctrl"></p>
<p>…</p>
<p></keyMap></p>
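<p>A validator could detect this kind of conflict by expanding each <code>modifiers</code> value into the modifier states it accepts and intersecting them pairwise. The following is a sketch under the same simplification as before (left and right variants are not expanded); <code>expand</code> and <code>find_overlaps</code> are hypothetical names.</p>

```python
from itertools import combinations


def expand(attribute):
    """Return the set of concrete modifier states accepted by a modifiers value."""
    states = set()
    for combo in attribute.split():
        partial = [frozenset()]
        for token in combo.split("+"):
            name = token.rstrip("?")
            if token.endswith("?"):
                # "Don't care": keep the states without the key, add states with it.
                partial = partial + [state | {name} for state in partial]
            else:
                partial = [state | {name} for state in partial]
        states.update(partial)
    return states


def find_overlaps(modifier_attributes):
    """List (a, b, shared states) for every pair of key maps that overlap."""
    overlaps = []
    for a, b in combinations(modifier_attributes, 2):
        shared = expand(a) & expand(b)
        if shared:
            overlaps.append((a, b, shared))
    return overlaps


print(find_overlaps(["ctrl+shift?", "ctrl"]))
```

<p>For the two key maps above, the shared state is "Ctrl alone", which is why the layout is invalid.</p>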
<p>Modifier Examples:</p>
<p><keyMap modifiers="cmd?+opt+caps?+shift" /></p>
<p>Caps-Lock may be ON or OFF, Option must be ON, Shift must be ON
and Command may be ON or OFF.</p>
<p><keyMap modifiers="shift caps"
fallback="true" /></p>
<p>Caps-Lock must be ON OR Shift must be ON. This is also the fallback
key map.</p>
<p>If the modifiers attribute is not present on a keyMap then that
particular key map is the base map.</p>
<hr>
<h3>
5.8 <a name="Element_map" href="#Element_map">Element: map</a>
</h3>
<p>This element defines a mapping between the base character and
the output for a particular set of active modifier keys. This element
must have the keyMap element as its parent.</p>
<p>If no map element has been defined for a particular ISO layout
position, pressing that key produces no output.</p>
<p>Syntax</p>
<pre><map
iso="{the iso position}"
to="{the output}"
[longPress="{long press keys}"]
[transform="no"]
/><!-- {Comment to improve readability (if needed)} --></pre>
<dl>
<dt>Attribute: iso (required)</dt>
<dd>The iso attribute represents the ISO layout position of the
key (see the definition at the beginning of the document for more
information).</dd>
</dl>
<dl>
<dt>Attribute: to (required)</dt>
<dd>The to attribute contains the output sequence of characters
that is emitted when pressing this particular key. Control
characters, whitespace (other than the regular space character) and
combining marks in this attribute are escaped using the \u{...}
notation.</dd>
</dl>
<dl>
<dt>Attribute: longPress (optional)</dt>
<dd>The longPress attribute contains any characters that can be
emitted by "long-pressing" a key; this feature is prominent on
mobile devices. The possible sequences of characters that can be
emitted are whitespace delimited. Control characters, combining
marks and whitespace (which is intended to be a long-press option)
in this attribute are escaped using the \u{...} notation.</dd>
</dl>
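<p>Reading these attributes involves splitting on whitespace and resolving the \u{...} escapes. The sketch below assumes each escape holds a single hexadecimal code point of one to six digits; the names <code>unescape</code> and <code>long_press_options</code> are hypothetical.</p>

```python
import re

# Matches \u{...} with one to six hexadecimal digits (an assumption of this sketch).
ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")


def unescape(value):
    """Replace each \\u{...} escape with the character it denotes."""
    return ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), value)


def long_press_options(attribute):
    """Split a longPress value on whitespace, then unescape each option."""
    # Splitting first keeps an escaped space (\u{20}) as an option of its own.
    return [unescape(item) for item in attribute.split()]


print(long_press_options("é è ê ë"))
```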
<dl>
<dt>Attribute: transform="no" (optional)</dt>
<dd>The transform attribute is used to define a key that never
participates in a transform but its output shows up as part of a
transform. This attribute is necessary because two different keys
could output the same characters (with different keys or modifier
combinations) but only one of them is intended to be a dead-key and
participate in a transform. This attribute value must be no if it is
present.</dd>
</dl>
<dl>
<dt>Attribute: multitap (optional)</dt>
<dd>
A space-delimited list of strings, where each successive element of the list is produced by the corresponding number of quick taps. For example, two taps on the key C01 will produce a “c” in the following example. <br>
<br> <em>Example:</em><br> <br>
<map iso="C01" to="a" multitap="bb c d"></dd>
</dl>
<dl>
<dt>Attribute: longPress-status (optional)</dt>
<dd>
Indicates optional longPress values. Must only occur with a
longPress value. May be suppressed or shown, depending on user
settings. There can be two map elements that differ only by
longPress-status, allowing two different sets of longPress values.<br>
<br> <em>Example:</em><br> <br> <map
iso="D01" to="a" longPress="à â % æ á ä ã å
ā ª"/><br> <map iso="D01" to="a"
longPress="à â á ä ã å ā"
longPress-status="optional"/>
</dd>
</dl>
<dl>
<dt>Attribute: optional (optional)</dt>
<dd>Indicates optional mappings. May be suppressed or shown,
depending on user settings.</dd>
</dl>
<dl>
<dt>Attribute: hint (optional)</dt>
<dd>
Indicates a hint as to long-press contents, such as the first
character of the longPress value, that can be displayed on the key.
May be suppressed or shown, depending on user settings.<br> <br>
<i>Example:</i> where the hint is "{":<br>
<div style='text-align: center'>
<img alt="keycap hint" src='images/keycapHint.png'>
</div>
</dd>
</dl>
<p>For example, suppose there are the following keys, their output
and one transform:</p>
<blockquote>
<p>E00 outputs `</p>
<p>Option+E00 outputs ` (the dead-version which participates in
transforms).</p>
<p>`e → è</p>
</blockquote>
<p>Then the first key must be tagged with transform="no"
to indicate that it should never participate in a transform.</p>
<p>Comment: US key equivalent, base key, escaped output and
escaped longpress</p>
<p>In the generated files, a comment is included to help the
readability of the document. This comment simply shows the English
key equivalent (with prefix key=), the base character (base=), the
escaped output (to=) and escaped long-press keys (long=). These
comments have been inserted strategically in places to improve
readability. Not all comments include all components since
some of them may be obvious.</p>
<p>Examples</p>
<pre><keyboard locale="fr-BE-t-k0-windows"><br> …<br> <keyMap modifiers="shift"><br> <map iso="D01" to="A" /> <!-- key=Q --><br> <map iso="D02" to="Z" /> <!-- key=W --><br> <map iso="D03" to="E" /><br> <map iso="D04" to="R" /><br> <map iso="D05" to="T" /><br> <map iso="D06" to="Y" /><br> …<br> </keyMap><br> …<br></keyboard><br><keyboard locale="ps-t-k0-windows"><br> …<br> <keyMap modifiers='altR+caps? ctrl+alt+caps?'><br> <map iso="D04" to="\u{200e}" /> <!-- key=R base=ق --><br> <map iso="D05" to="\u{200f}" /> <!-- key=T base=ف --><br> <map iso="D08" to="\u{670}" /> <!-- key=I base=ه to= ٰ --><br> …<br> </keyMap><br> …<br></keyboard></pre>
<h4>
5.8.1 <a name="Element_flicks" href="#Element_flicks">Elements:
flicks, flick</a></h4>
<p class='dtd'><!ELEMENT keyMap ( map | flicks )+ ><br>
<!ELEMENT flick EMPTY><br>
<!ATTLIST flick directions NMTOKENS><br>
<!ATTLIST flick to CDATA><br>
<!--@VALUE--></p>
<p>The flicks element is used to generate results from a "flick" of the finger on a mobile device. The <strong>directions</strong> attribute value is a space-delimited list of keywords that describe a path, currently restricted to the cardinal and intercardinal directions {n e s w ne nw se sw}. The <strong>to</strong> attribute value is the result of (one or more) flicks.</p>
<p>Example: a flick to the northeast and then south produces two code points.</p>
<pre><flicks iso="C01">
    <flick directions="ne s" to="\uABCD\uDCBA"/>
</flicks></pre>
<hr>
<h3>
5.9 <a name="Element_import" href="#Element_import">Element:
import</a>
</h3>
<p>The import element references another file of
the same type and includes all the subelements of the top level
element as though the import element were being replaced by those
elements, in the appropriate section of the XML file. For example:</p>
<pre> <import path="standard_transforms.xml"/></pre>
<dl>
<dt>Attribute: path (required)</dt>
<dd>The value contains a relative path to the included LDML
file. There is a standard set of directories to be searched that an
application may provide. This set is always prepended with the
directory in which the file currently being read is stored.</dd>
</dl>
<p>If two identical elements, as described below,
are defined, the later element will take precedence. Thus if a
hardwareMap/map for the same keycode on the same page is defined
twice (for example once in an included file), the later one will be
the resulting mapping.</p>
<p>Elements are considered to have three
attributes that make them unique: the tag of the element, the parent
and the identifying attribute. The parent in its turn is a unique
element and so on up the chain. If the distinguishing attribute is
optional, its non-existence is represented with an empty value. Here
is a list of elements and their defining attributes. If an element is
not listed then if it is a leaf element, only one occurs and it is
merely replaced. If it has children, then the sub elements are
considered, in effect merging the element in question.</p>
<table>
<!-- nocaption -->
<tbody>
<tr>
<td><p>Element</p></td>
<td><p>Parent</p></td>
<td><p>Distinguishing attribute</p></td>
</tr>
<tr>
<td><p>keyMap</p></td>
<td><p>keyboard</p></td>
<td><p>@modifiers</p></td>
</tr>
<tr>
<td><p>map</p></td>
<td><p>keyMap</p></td>
<td><p>@iso</p></td>
</tr>
<tr>
<td><p>display</p></td>
<td><p>displayMap</p></td>
<td><p>@mapOutput</p></td>
</tr>
<tr>
<td><p>layout</p></td>
<td><p>layouts</p></td>
<td><p>@modifier</p></td>
</tr>
</tbody>
</table>
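<p>The override rule can be sketched as follows. This is a minimal illustration in Python, not part of the specification: the function name merge_maps is hypothetical, only keyMap/map elements are handled, and the import element is assumed to precede the local definitions so that imported elements count as the earlier ones.</p>

```python
import xml.etree.ElementTree as ET

def merge_maps(imported_xml, local_xml):
    """Merge keyMap/map elements; a later definition replaces an earlier one.

    A map is identified by its parent's @modifiers and its own @iso.
    A missing @modifiers is represented by the empty string.
    """
    merged = {}
    for source in (imported_xml, local_xml):  # imported content comes first
        root = ET.fromstring(source)
        for key_map in root.findall("keyMap"):
            modifiers = key_map.get("modifiers", "")
            for key in key_map.findall("map"):
                merged[(modifiers, key.get("iso"))] = key.get("to")
    return merged

imported = ('<keyboard><keyMap><map iso="D01" to="q"/>'
            '<map iso="D02" to="w"/></keyMap></keyboard>')
local = '<keyboard><keyMap><map iso="D01" to="a"/></keyMap></keyboard>'
result = merge_maps(imported, local)
print(result[("", "D01")])  # the local definition replaces the imported one
print(result[("", "D02")])  # only defined in the imported file
```

<p>Keying the dictionary on the parent's distinguishing attribute together with the element's own mirrors the chain of identifying attributes described above.</p>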
<p>In order to help identify mistakes, it is an
error if a file contains two elements that override each other. All
element overrides must come as a result of an <import> element
either for the element overridden or the element overriding.</p>
<p>The following elements are not imported from
the source file:</p>
<ul>
<li>version</li>
<li>generation</li>
<li>names</li>
<li>settings</li>
</ul>
<hr>
<h3>
5.10 <a name="Element_displayMap" href="#Element_displayMap">Element:
displayMap</a>
</h3>
<p>The displayMap can be used to describe what is
to be displayed on the keytops for various keys. For the most part,
such explicit information is unnecessary since the @to attribute from
the keyMap/map element can be used. But there are some characters,
such as diacritics, that do not display well on their own and so
explicit overrides for such characters can help. The displayMap
consists of a list of display sub elements.</p>
<p>DisplayMaps are designed to be shared across
many different keyboard layout descriptions, and imported where
needed.</p>
<hr>
<h3>
5.11 <a name="Element_display" href="#Element_display">Element:
display</a>
</h3>
<p>The display element describes how a character,
that has come from a keyMap/map element, should be displayed on a
keyboard layout where such display is possible.</p>
<dl>
<dt>Attribute: mapOutput (required)</dt>
<dd>Specifies the character or character sequence from the
keyMap/map element that is to have a special display.</dd>
</dl>
<dl>
<dt>Attribute: display (required)</dt>
<dd>Specifies the character sequence that should be
displayed on the keytop for any key that generates the @mapOutput
sequence. (It is an error if the value of the display attribute is
the same as the value of the mapOutput attribute.)</dd>
</dl>
<pre> <keyboard>
    <keyMap>
        <map iso="C01" to="a" longPress="\u0301 \u0300"/>
    </keyMap>
    <displayMap>
        <display mapOutput="\u0300" display="u\u02CB"/>
        <display mapOutput="\u0301" display="u\u02CA"/>
    </displayMap>
 </keyboard></pre>
<p>To allow displayMaps to be shared across
descriptions, there is no requirement that @mapOutput matches any @to
in any keyMap/map element in the keyboard description.</p>
<hr>
<h3>
5.12 <a name="Element_layer" href="#Element_layer">Element: layer</a>
</h3>
<p>A layer element describes the configuration of
keys on a particular layer of a keyboard. It contains row elements to
describe which keys exist in each row and also switch elements that
describe how keys in the layer switch the layer to another. In
addition, for platforms that require a mapping from a key to a
virtual key (for example Windows or Mac) there is also a vkeys
element to describe the mapping.</p>
<dl>
<dt>Attribute: modifier (required)</dt>
<dd>This has two roles. It acts as an identifier for the layer
element and also provides the linkage into a keyMap. A modifier is a
single modifier combination such that it is matched by one of the
modifier combinations in one of the keyMap/@modifiers attributes. To
indicate that no modifiers apply, the reserved name "none" is
used. For the purposes of fallback vkey mapping, the following
modifier components are reserved: "shift", "ctrl", "alt", "caps",
"cmd", "opt" along with the "L" and "R" optional single suffixes for
the first 3 in that list. There must be a keyMap whose @modifiers
attribute matches the @modifier attribute of the layer element. It
is an error if there is no such keyMap.</dd>
</dl>
<p>The keyMap/@modifiers attribute often includes multiple
combinations that match. It is not necessary (or preferred) to include
all of these. Instead a minimal matching element should be used, such
that exactly one keyMap is matched.</p>
<p>The following are examples of situations where
the @modifiers and @modifier do not match, with a different keyMap
definition than above.</p>
<table>
<!-- nocaption -->
<tbody>
<tr>
<th><p>keyMap/@modifiers</p></th>
<th><p>layer/@modifier</p></th>
</tr>
<tr>
<td><p>shiftL</p></td>
<td><p>shift (ambiguous)</p></td>
</tr>
<tr>
<td><p>altR</p></td>
<td><p>alt</p></td>
</tr>
<tr>
<td><p>shiftL?+shiftR</p></td>
<td><p>shift</p></td>
</tr>
</tbody>
</table>
<p>And these do match:</p>
<table>
<!-- nocaption -->
<tbody>
<tr>
<th><p>keyMap/@modifiers</p></th>
<th><p>layer/@modifier</p></th>
</tr>
<tr>
<td><p>shiftL shiftR</p></td>
<td><p>shift</p></td>
</tr>
</tbody>
</table>
<p>The use of @modifier as an identifier for a
layer is sufficient since it is always unique among the set of layer
elements in a keyboard.</p>
<hr>
<h3>
5.13 <a name="Element_row" href="#Element_row">Element: row</a>
</h3>
<p>A row element describes the keys that are
present in a row of a keyboard. Row elements are ordered within a
layer element with the top visual row being stored first. The row
element introduces the keyId, which may be an ISOKey or a specialKey.
More formally:</p>
<pre> keyId = ISOKey | specialKey<br> ISOKey = [A-Z][0-9][0-9]<br> specialKey = [a-z][a-zA-Z0-9]{2,7}</pre>
<p>
ISOKey denotes a key having an <a href="#Definitions">ISO
Position</a>. SpecialKey is used to identify functional keys occurring
on a virtual keyboard layout.
</p>
<dl>
<dt>Attribute: keys (required)</dt>
<dd>This is a string that lists the keyId for each of the keys
in a row. Key ranges may be contracted to firstkey-lastkey but only
for ISOKey type keyIds. The interpolation between the first and last
key names is entirely numeric. Thus D00-D03 is equivalent to D00
D01 D02 D03. It is an error if the first and last keys do not have
the same alphabetic prefix or the last key numeric component is less
than or equal to the first key numeric component.</dd>
</dl>
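<p>The range contraction can be illustrated with a short Python sketch. The function name expand_keys and the error messages are hypothetical; the two patterns are taken from the keyId grammar above.</p>

```python
import re

ISO_KEY = re.compile(r"[A-Z][0-9][0-9]")
SPECIAL_KEY = re.compile(r"[a-z][a-zA-Z0-9]{2,7}")

def expand_keys(keys):
    """Expand a row/@keys value into a list of individual keyIds."""
    result = []
    for token in keys.split():
        if "-" in token:
            first, last = token.split("-", 1)
            if not (ISO_KEY.fullmatch(first) and ISO_KEY.fullmatch(last)):
                raise ValueError(f"ranges are only allowed for ISOKeys: {token}")
            if first[0] != last[0]:
                raise ValueError(f"different alphabetic prefixes: {token}")
            start, end = int(first[1:]), int(last[1:])
            if end <= start:
                raise ValueError(f"last key must come after first key: {token}")
            # Interpolation is purely numeric, keeping the two-digit form.
            result.extend(f"{first[0]}{n:02d}" for n in range(start, end + 1))
        elif ISO_KEY.fullmatch(token) or SPECIAL_KEY.fullmatch(token):
            result.append(token)
        else:
            raise ValueError(f"not a valid keyId: {token}")
    return result

print(expand_keys("D00-D03"))
print(expand_keys("shift B01-B03 bksp"))
```

<p>Both error conditions named in the attribute description (mismatched prefixes, and a last key that does not come after the first) are rejected rather than silently expanded to an empty range.</p>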
<p>specialKey type keyIds may take any value
within their syntactic constraint. But the following specialKeys are
reserved to allow applications to identify them and give them special
handling:</p>
<ul>
<li>"bksp", "enter", "space", "tab", "esc", "sym", "num"</li>
<li>all the reserved modifier names</li>
<li>specialKeys starting with the letter "x" for future reserved
names.</li>
</ul>
<p>Here is an example of a row element:</p>
<pre> <layer modifier="none">
<row keys="D01-D10"/>
<row keys="C01-C09"/>
<row keys="shift B01-B07 bksp"/>
<row keys="sym A01 smilies A02-A03 enter"/>
</layer>
</pre>
<hr>
<h3>
5.14 <a name="Element_switch" href="#Element_switch">Element:
switch</a>
</h3>
<p>The switch element describes a function key
that has been included in the layout. It specifies which layer
pressing the key switches you to and also what the key looks like.</p>
<dl>
<dt>Attribute: iso (required)</dt>
<dd>The keyId as specified in one of the row elements. This must
be a specialKey and not an ISOKey.</dd>
</dl>
<dl>
<dt>Attribute: layout (required)</dt>
<dd>The modifier attribute of the layer element that
describes the layer the user gets switched to.</dd>
</dl>
<dl>
<dt>Attribute: display (required)</dt>
<dd>A string to be displayed on the key.</dd>
</dl>
<p>Here is an example of a switch element for a
shift key:</p>
<pre> <layer modifier="none">
<row keys="D01-D10"/>
<row keys="C01-C09"/>
<row keys="shift B01-B07 bksp"/>
<row keys="sym A01 smilies A02-A03 enter"/>
<switch iso="shift" layout="shift" display="&#x21EA;"/>
</layer>
<layer modifier="shift">
<row keys="D01-D10"/>
<row keys="C01-C09"/>
<row keys="shift B01-B07 bksp"/>
<row keys="sym A01 smilies A02-A03 enter"/>
<switch iso="shift" layout="none" display="&#x21EA;"/>
</layer></pre>
<hr>
<h3>
5.15 <a name="Element_vkeys" href="#Element_vkeys">Element: vkeys</a>
</h3>
<p>On some architectures, applications may
directly interact with keys before they are converted to characters.
The keys are identified using a virtual key identifier or vkey. The
mapping between a physical keyboard key and a vkey is keyboard-layout
dependent. For example, a French keyboard would identify the D01 key
as being an 'a' with a vkey of 'a' as opposed to 'q' on a US English
keyboard. While vkeys are layout dependent, they are not modifier
dependent. A shifted key always has the same vkey as its unshifted
counterpart. In effect, a key is identified by its vkey and the
modifiers active at the time the key was pressed.</p>
<p>For a physical keyboard there is a layout
specific default mapping of keys to vkeys. These are listed in a
vkeys element which takes a list of vkey element mappings and is
identified by a type. There are different vkey mappings required for
different platforms. While type="windows" vkeys are very similar to
type="osx" vkeys, they are not identical and require their own
mapping.</p>
<p>The most common model for specifying vkeys is
to import a standard mapping, say to the US layout, and then to add a
vkeys element to change the mapping appropriately for the specific
layout.</p>
<p>In addition to describing physical keyboards,
vkeys also get used in virtual keyboards. Here the vkey mapping is
local to a layer and therefore a vkeys element may occur within a
layer element. In the case where a layer element has no vkeys
element then the resulting mapping may either be empty (none of the
keys represent keys that have vkey identifiers) or may fall back to
the layout-wide vkeys mapping. Fallback only occurs if the layer's
modifier attribute consists only of standard modifiers as listed as
being reserved in the description of the layer/@modifier attribute,
and if the modifiers are standard for the platform involved. So for
Windows, 'cmd' is a reserved modifier but it is not standard for
Windows. Therefore on Windows the vkey mapping for a layer with
@modifier="cmd" would be empty.</p>
<p>A vkeys element consists of a list of vkey
elements.</p>
<hr>
<h3>
5.16 <a name="Element_vkey" href="#Element_vkey">Element: vkey</a>
</h3>
<p>A vkey element describes a mapping between a
key and a vkey for a particular platform.</p>
<dl>
<dt>Attribute: iso (required)</dt>
<dd>The ISOkey being mapped.</dd>
</dl>
<dl>
<dt>Attribute: type</dt>
<dd>Current values: android, chromeos, osx, und, windows.</dd>
</dl>
<dl>
<dt>Attribute: vkey (required)</dt>
<dd>The resultant vkey identifier.</dd>
</dl>
<dl>
<dt>Attribute: modifier</dt>
<dd>This attribute may only be used if the parent vkeys element
is a child of a layer element. If present it allows an unmodified
key from a layer to represent a modified virtual key.</dd>
</dl>
<p>This example shows some of the mappings for a
French keyboard layout:</p>
<pre> <i>shared/win-vkey.xml</i>
<keyboard>
<vkeys type="windows">
<vkey iso="D01" vkey="VK_Q"/>
<vkey iso="D02" vkey="VK_W"/>
<vkey iso="C01" vkey="VK_A"/>
<vkey iso="B01" vkey="VK_Z"/>
</vkeys>
</keyboard><br>
<i>shared/win-fr.xml</i>
<keyboard>
<import path="shared/win-vkey.xml"/>
<keyMap>
<map iso="D01" to="a"/>
<map iso="D02" to="z"/>
<map iso="C01" to="q"/>
<map iso="B01" to="w"/>
</keyMap><br>
<keyMap modifiers="shift">
<map iso="D01" to="A"/>
<map iso="D02" to="Z"/>
<map iso="C01" to="Q"/>
<map iso="B01" to="W"/>
</keyMap><br>
<vkeys type="windows">
<vkey iso="D01" vkey="VK_A"/>
<vkey iso="D02" vkey="VK_Z"/>
<vkey iso="C01" vkey="VK_Q"/>
<vkey iso="B01" vkey="VK_W"/>
</vkeys>
</keyboard></pre>
<p>In the context of a virtual keyboard there
might be a symbol layer with the following layout:</p>
<pre> <keyboard>
<keyMap>
<map iso="D01" to="1"/>
<map iso="D02" to="2"/>
...
<map iso="D09" to="9"/>
<map iso="D10" to="0"/>
<map iso="C01" to="!"/>
<map iso="C02" to="@"/>
...
<map iso="C09" to="("/>
<map iso="C10" to=")"/>
</keyMap><br>
<layer modifier="sym">
<row keys="D01-D10"/>
<row keys="C01-C09"/>
<row keys="shift B01-B07 bksp"/>
<row keys="sym A00-A03 enter"/>
<switch iso="sym" layout="none" display="ABC"/>
<switch iso="shift" layout="sym+shift" display="&=/<"/>
<vkeys type="windows">
<vkey iso="D01" vkey="VK_1"/>
...
<vkey iso="D10" vkey="VK_0"/>
<vkey iso="C01" vkey="VK_1" modifier="shift"/>
...
<vkey iso="C10" vkey="VK_0" modifier="shift"/>
</vkeys>
</layer>
</keyboard></pre>
<hr>
<h3>
5.17 <a
name="Element_transforms" href="#Element_transforms">Element:
transforms</a>
</h3>
<p>This element defines a group of one or more transform elements
associated with this keyboard layout. It is used to support features
such as dead-keys using a straightforward structure that works for all the
keyboards tested, and that results in readable source data.</p>
<p>
There can be multiple <transforms> elements.</p>
<p>Syntax</p>
<p><transforms type="..."></p>
<p>{a set of transform elements}</p>
<p></transforms></p>
<dl>
<dt>Attribute: type (required)</dt>
<dd>Current values: simple, final.</dd>
</dl>
<hr>
<h3>
5.18 <a
name="Element_transform" href="#Element_transform">Element:
transform</a>
</h3>
<p>
This element must have the transforms element as its parent. This
element represents a single transform that may be performed using the
keyboard layout. A transform is an
element that specifies a set of conversions from sequences of code
points into one (or more) other code points. For example, in most
French keyboards hitting the "^" dead-key followed by the "e" key
produces "ê".
</p>
<p>Syntax</p>
<p><transform from="{combination of characters}"
to="{output}"></p>
<dl>
<dt>Attribute: from (required)</dt>
<dd>
The from attribute consists of a sequence of elements. Each element
matches one character and may consist of a codepoint or a UnicodeSet
(both as defined in <a
href="http://www.unicode.org/reports/tr35/#Unicode_Sets">UTS#35
section 5.3.3</a>).
</dd>
</dl>
<p>For example, suppose there are the following transforms:</p>
<blockquote>
<p>^e → ê</p>
<p>^a → â</p>
<p>^o → ô</p>
</blockquote>
<p>If the user types a key that produces "^", the keyboard enters
a dead state. When the user then types a key that produces an "e",
the transform is invoked, and "ê" is output. Suppose a user presses
keys producing "^" then "u". In this case, there is no match for the
"^u", and the "^" is output if the failure attribute in the transform
element is set to emit. If there is no transform starting with "u",
then it is also output (again only if failure is set to emit) and the
mechanism leaves the "dead" state.</p>
<p>The UI may show an initial sequence of matching characters with
a special format, as is done with dead-keys on the Mac, and modify
them as the transform completes. This behavior is specified in the
partial attribute in the transform element.</p>
<p>Most transforms in practice have only a couple of characters.
But for completeness, the behavior is defined on all strings:</p>
<ol>
<li>If there could be a longer match if the user were to type
additional keys, go into a 'dead' state.</li>
<li>If there could not be a longer match, find the longest
actual match, emit the transformed text (if failure is set to emit),
and start processing again with the remainder.</li>
<li>If there is no possible match, output the first character,
and start processing again with the remainder.</li>
</ol>
<p>Suppose that there are the following transforms:</p>
<blockquote>
<p>ab → x</p>
<p>abc → y</p>
<p>abef → z</p>
<p>bc → m</p>
<p>beq → n</p>
</blockquote>
<p>Here's what happens when the user types various sequences of
characters:</p>
<table>
<!-- nocaption -->
<tbody>
<tr>
<td><p>Input characters</p></td>
<td><p>Result</p></td>
<td><p>Comments</p></td>
</tr>
<tr>
<td><p>ab</p></td>
<td> </td>
<td><p>No output, since there is a longer transform with
this as prefix.</p></td>
</tr>
<tr>
<td><p>abc</p></td>
<td><p>y</p></td>
<td><p>Complete transform match.</p></td>
</tr>
<tr>
<td><p>abd</p></td>
<td><p>xd</p></td>
<td><p>The longest match is "ab", so that is converted and
output. The 'd' follows, since it is not the start of any
transform.</p></td>
</tr>
<tr>
<td><p>abeq</p></td>
<td><p>xeq</p></td>
<td><p>"ab" wins over "beq", since it comes first. That
is, there is no longer possible match starting with 'a'.</p></td>
</tr>
<tr>
<td><p>bc</p></td>
<td><p>m</p></td>
<td> </td>
</tr>
</tbody>
</table>
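<p>The three matching rules and the table above can be reproduced with the following Python sketch. It is a simplified illustration, not a reference implementation: the name apply_transforms is hypothetical, transforms are plain strings with no UnicodeSets or context attributes, and the failure and partial attributes are ignored.</p>

```python
def apply_transforms(transforms, text):
    """Return (output, pending) for the characters typed so far.

    `transforms` maps each non-empty @from string to its @to string.
    `pending` holds input still in the 'dead' state, waiting for more keys.
    """
    output = []
    while text:
        # Rule 1: further keys could still complete a longer match.
        if any(src.startswith(text) and src != text for src in transforms):
            return "".join(output), text
        # Rule 2: otherwise use the longest transform matching at the start.
        matches = [src for src in transforms if text.startswith(src)]
        if matches:
            best = max(matches, key=len)
            output.append(transforms[best])
            text = text[len(best):]
        else:
            # Rule 3: no match is possible, so pass the first character through.
            output.append(text[0])
            text = text[1:]
    return "".join(output), ""

rules = {"ab": "x", "abc": "y", "abef": "z", "bc": "m", "beq": "n"}
for typed in ("ab", "abc", "abd", "abeq", "bc"):
    print(typed, apply_transforms(rules, typed))
```

<p>Returning the pending text separately makes the dead state visible: for the input "ab" the output is empty and "ab" is held back until the next key arrives.</p>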
<p>Control characters, combining marks and whitespace in this
attribute are escaped using the \u{...} notation.</p>
<dl>
<dt>Attribute: to (required)</dt>
<dd>
This attribute represents the characters that are output from the
transform. The output can contain more than one
character, so you could have <transform from="´A"
to="Fred"/>.
</dd>
</dl>
<p>Control characters, whitespace (other than the regular space
character) and combining marks in this attribute are escaped using
the \u{...} notation.</p>
<p>Examples</p>
<pre><keyboard locale="fr-CA-t-k0-CSA-osx"><br> <transforms type="simple"><br> <transform from="´a" to="á" /><br> <transform from="´A" to="Á" /><br> <transform from="´e" to="é" /><br> <transform from="´E" to="É" /><br> <transform from="´i" to="í" /><br> <transform from="´I" to="Í" /><br> <transform from="´o" to="ó" /><br> <transform from="´O" to="Ó" /><br> <transform from="´u" to="ú" /><br> <transform from="´U" to="Ú" /><br> </transforms><br> ...<br></keyboard><br><keyboard locale="nl-BE-t-k0-chromeos"><br> <transforms type="simple"><br> <transform from="\u{30c}a" to="ǎ" /> <!-- ̌a → ǎ --><br> <transform from="\u{30c}A" to="Ǎ" /> <!-- ̌A → Ǎ --><br> <transform from="\u{30a}a" to="å" /> <!-- ̊a → å --><br> <transform from="\u{30a}A" to="Å" /> <!-- ̊A → Å --><br> </transforms><br> ...<br></keyboard></pre>
<dl>
<dt>Attribute: before (optional)</dt>
<dd>This attribute consists of a sequence of elements (codepoint
or UnicodeSet) to match the text up to the current position in the
text (this is similar to a regex "look behind" assertion:
(?<=a)b matches a "b" that is preceded by an
"a"). The attribute must match for the transform to apply.
If missing, no before constraint is applied. The attribute value
must not be empty.</dd>
</dl>
<dl>
<dt>Attribute: after (optional)</dt>
<dd>This attribute consists of a sequence of elements (codepoint
or UnicodeSet) and matches as a zero-width assertion after the @from
sequence. The attribute must match for the transform to apply. If
missing, no after constraint is applied. The attribute value must
not be empty. When the transform is applied, the string matched by
the @from attribute is replaced by the string in the @to attribute,
with the text matched by the @after attribute left unchanged. After
the change, the current position is reset to just after the text
output from the @to attribute and just before the text matched by
the @after attribute. Warning: some legacy implementations may not
be able to make such an adjustment and will place the current
position after the @after matched string.</dd>
</dl>
<dl>
<dt>Attribute: error (optional)</dt>
<dd>If set this attribute indicates that the keyboarding
application may indicate an error to the user in some way.
Processing may stop and rewind to any state before the key was
pressed. If processing does stop, no further transforms on the same
input are applied. The @error attribute takes the value "fail", or
must be absent. If processing continues, the @to is used for output
as normal. It thus should contain a reasonable value.</dd>
</dl>
<p>For example:</p>
<blockquote><transform
from="\u037A\u037A" to="\u037A"
error="fail" /></blockquote>
<p>This indicates that it is an error to type two
iota subscripts immediately after each other.</p>
<p>In terms of how these different attributes work
in processing a sequence of transforms, consider the transform:</p>
<blockquote><transform
before="X" from="Y" after="Z"
to="B"/></blockquote>
<p>This would transform the string:</p>
<blockquote>XYZ → XBZ</blockquote>
<p>If we mark where the current match position is
before and after the transform we see:</p>
<blockquote>X | Y Z → X B | Z</blockquote>
<p>And a subsequent transform could transform the
Z string, looking back (using @before) to match the B.</p>
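<p>The XYZ → XBZ example can be modelled with regular-expression assertions, as in the Python sketch below. This is only an approximation under stated assumptions: apply_rule is a hypothetical name, the parameter source stands for the @from attribute (from is a reserved word in Python), all attribute values are literal strings, and the rule is applied at its first match in the text rather than at a tracked current position.</p>

```python
import re

def apply_rule(text, before, source, after, to):
    """Apply one context-sensitive rule at its first match.

    Returns (new_text, position), where position is the index just after
    the inserted @to text, or None if the rule does not match.
    """
    pattern = "(?<={}){}(?={})".format(
        re.escape(before), re.escape(source), re.escape(after))
    match = re.search(pattern, text)
    if match is None:
        return None
    # Only the @from text is replaced; the context on both sides is kept.
    new_text = text[:match.start()] + to + text[match.end():]
    return new_text, match.start() + len(to)

print(apply_rule("XYZ", before="X", source="Y", after="Z", to="B"))
```

<p>The returned position falls between the output and the text matched by @after, which is where a subsequent transform would resume.</p>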
<p>There are other keying behaviors that are
needed particularly in handling languages and scripts from various
parts of the world. The behaviors intended to be covered by the
transforms are:</p>
<ul>
<li>Reordering combining marks. The order required for
underlying storage may differ considerably from the desired typing
order. In addition, a keyboard may want to allow for different
typing orders.</li>
<li>Error indication. Sometimes a keyboard layout will want to
specify to the application that a particular keying sequence in a
context is in error and that the application should indicate that
that particular keypress is erroneous.</li>
<li>Backspace handling. There are various approaches to handling
the backspace key. An application may treat it as an undo of the
last key input, or it may simply delete the last character in the
currently output text, or it may use transform rules to tell it how
much to delete.</li>
</ul>
<p>We consider each transform type in turn and
consider attributes to the <transforms> element pertinent to
that type.</p>
<hr>
<h3>
5.19 <a name="Element_reorder" href="#Element_reorder">Element:
reorder</a>
</h3>
<p>The reorder transform is applied after all
transforms except for those with type="final".</p>
<p>This transform has the job of reordering
sequences of characters that have been typed, from their typed order
to the desired output order. The primary concern in this transform is
to sort combining marks into their correct relative order after a
base, as described in this section. Because reorder transforms can be
quite complex, keyboard layouts will almost always import them.</p>
<p>The reordering algorithm consists of four
parts:</p>
<ol>
<li>Create a sort key for each character in the input string. A
sort key has 4 parts: (primary, index, tertiary, quaternary).
<ul>
<li>The <b>primary weight</b> is the primary order value.
</li>
<li>The <b>secondary weight</b> is the index, a position in
the input string, usually of the character itself, but it may be
of a character earlier in the string.
</li>
<li>The <b>tertiary weight</b> is a tertiary order value
(defaulting to 0).
</li>
<li>The <b>quaternary weight</b> is the index of the character
in the string. This ensures a stable sort for sequences of
characters with the same tertiary weight.
</li>
</ul>
</li>
<li>Mark each character as to whether it is a prebase character,
one that is typed before the base and logically stored after. Thus
it will have a primary order > 0.</li>
<li>Use the sort key and the prebase mark to identify runs. A
run starts with a prefix that contains any prebase characters and a
single base character whose primary and tertiary keys are both 0. The run
extends until, but not including, the start of the prefix of the
next run or end of the string.
<ul>
<li>run := prebase* (primary=0 && tertiary=0) ((primary≠0 ||
tertiary≠0) && !prebase)*</li>
</ul>
</li>
<li>Sort the characters in each run based
on their sort keys.</li>
</ol>
<p>The primary order of a character with the
Unicode property Canonical_Combining_Class (ccc) of 0 may well not be
0. In addition, a character may receive a different primary order
dependent on context. For example, in the Devanagari sequence ka
halant ka, the first ka would have a primary order 0 while the halant
ka sequence would give both halant and the second ka a primary order
> 0, for example 2. Note that a "base" character in this discussion
is not a Unicode base character. It is instead a character with
primary=0.</p>
<p>In order to get the characters into the correct
relative order, it is necessary not only to order combining marks
relative to the base character, but also to order some combining
marks in a subsequence following another combining mark. For example
in Devanagari, a nukta may follow a consonant character, but it may
also follow a conjunct consisting of a consonant, halant, consonant.
Notice that the second consonant is not, in this model, the start of
a new run because some characters may need to be reordered to before
the first base, for example repha. The repha would get primary <
0, and be sorted before the character with order = 0, which is, in
the case of Devanagari, the initial consonant of the orthographic
syllable.</p>
<p>The reorder transform consists of a single
element type: <reorder> encapsulated in a <reorders>
element. Each is a rule that matches against a string of characters
with the action of setting the various ordering attributes (primary,
tertiary, tertiary_base, prebase) for the matched characters in the
string.</p>
<blockquote>
<p>
<strong>from</strong> This attribute follows the transform/@from
attribute and contains a string of elements. Each element matches
one character and may consist of a codepoint or a UnicodeSet (both
as defined in UTS#35 section 5.3.3). This attribute is required.
</p>
<p>
<strong>before</strong> This attribute follows the transform/@before
attribute and contains the element string that must match the string
immediately preceding the start of the string that the @from
matches.
</p>
<p>
<strong>after</strong> This attribute follows the transform/@after
attribute and contains the element string that must match the string
immediately following the end of the string that the @from matches.
</p>
<p>
<strong>order</strong> This attribute gives the primary order for
the elements in the matched string in the @from attribute. The value
is a simple integer between -128 and +127 inclusive, or a space
separated list of such integers. For a single integer, it is applied
to all the elements in the matched string. Details of such list type
attributes are given after all the attributes are described. If
missing, the order value of all the matched characters is 0. We
consider the order value for a matched character in the string.
</p>
<ul>
<li>If the value is 0 and its tertiary value is
0, then the character is the base of a new run.</li>
<li>If the value is 0 and its tertiary value is
non-zero, then it is a normal character in a run, with ordering
semantics as described in the @tertiary attribute.</li>
<li>If the value is negative, then the
character is a primary character and will reorder to be before the
base of the run.</li>
<li>If the value is positive, then the
character is a primary character and is sorted based on the order
value as the primary key following a previous base character.</li>
</ul>
<p>A character with a zero tertiary value is a
primary character and receives a sort key consisting of:</p>
<ul>
<li>Primary weight is the order value</li>
<li>Secondary weight is the index of the
character. This may be any value (character index, codepoint index)
such that its value is greater than the character before it and
less than the character after it.</li>
<li>Tertiary weight is 0.</li>
<li>Quaternary weight is the same as the
secondary weight.</li>
</ul>
<p>
<strong>tertiary</strong> This attribute gives the tertiary order
value to the characters matched. The value is a simple integer
between -128 and +127 inclusive, or a space separated list of such
integers. If missing, the value for all the characters matched is 0.
We consider the tertiary value for a matched character in the
string.
</p>
<ul>
<li>If the value is 0 then the character is
considered to have a primary order as specified in its order value
and is a primary character.</li>
<li>If the value is non-zero, then the order
value must be zero; otherwise it is an error. The character is
considered as a tertiary character for the purposes of ordering.</li>
</ul>
<p>A tertiary character receives its primary
order and index from a previous character, which it is intended to
sort closely after. The sort key for a tertiary character consists
of:</p>
<ul>
<li>Primary weight is the primary weight of the
primary character</li>
<li>Secondary weight is the index of the
primary character, not the tertiary character</li>
<li>Tertiary weight is the tertiary value for
the character.</li>
<li>Quaternary weight is the index of the
tertiary character.</li>
</ul>
<p>
<strong>tertiary_base</strong> This attribute is a space separated
list of "true" or "false" values corresponding
to each character matched. It is illegal for a tertiary character to
have a true tertiary_base value. For a primary character it marks
that this character may have tertiary characters moved after it.
When calculating the secondary weight for a tertiary character, the
most recently encountered primary character with a true
tertiary_base attribute is used. Primary characters with an @order
value of 0 automatically are treated as having tertiary_base true
regardless of what is specified for them.
</p>
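<p>As a rough illustration only, the sort keys described above could be computed as in the following Python sketch. The names <code>Char</code> and <code>sort_keys</code> are hypothetical. The sketch assumes that both the primary and the secondary weight of a tertiary character come from the most recent primary character that can act as a tertiary base, and that a tertiary character with no such predecessor falls back to weight 0 and index 0. It does not model @prebase.</p>

```python
from typing import List, NamedTuple, Tuple


class Char(NamedTuple):
    """Per-character reorder attributes (hypothetical container)."""
    order: int = 0
    tertiary: int = 0
    tertiary_base: bool = False


def sort_keys(chars: List[Char]) -> List[Tuple[int, int, int, int]]:
    """Return (primary, secondary, tertiary, quaternary) for each character."""
    keys = []
    # Assumption: with no earlier tertiary base, fall back to weight 0, index 0.
    base_order, base_index = 0, 0
    for index, ch in enumerate(chars):
        if ch.tertiary == 0:
            # Primary character: order value, own index, 0, own index.
            keys.append((ch.order, index, 0, index))
            # Characters with order 0 always count as tertiary bases.
            if ch.tertiary_base or ch.order == 0:
                base_order, base_index = ch.order, index
        else:
            if ch.order != 0:
                raise ValueError("a tertiary character must have order 0")
            if ch.tertiary_base:
                raise ValueError("a tertiary character cannot be a tertiary base")
            # Tertiary character: weights borrowed from its tertiary base.
            keys.append((base_order, base_index, ch.tertiary, index))
    return keys


example = [
    Char(),                              # base of the run
    Char(order=10, tertiary_base=True),  # primary character, may take tertiaries
    Char(tertiary=1),                    # tertiary character attached to it
    Char(order=5),                       # another primary character
]
keys = sort_keys(example)
```

<p>Sorting <code>keys</code> keeps the tertiary character directly after the primary character it was attached to, even though a character with a lower order value was typed later.</p>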
<p>
<strong>prebase</strong> This attribute gives the prebase attribute
for each character matched. The value may be "true" or
"false" or a space separated list of such values. If
missing the value for all the characters matched is false. It is
illegal for a tertiary character to have a true prebase value.
</p>
<p>If a primary character has a true prebase
value then the character is marked as being typed before the base
character of a run, even though it is intended to be stored after
it. The primary order gives the position, after the base character,
at which the prebase character will end up. Thus
@order may not be 0. These characters are part of the run prefix.
If such characters are typed then, in order to give the run a base
character after which characters can be sorted, an appropriate base
character, such as a dotted circle, is inserted into the output run,
until a real base character has been typed. A value of
"false" indicates that the character is not a prebase.</p>
</blockquote>
<p>There is no @error attribute.</p>
<p>For @from attributes with a match string length
greater than 1, the sort key information (@order, @tertiary,
@tertiary_base, @prebase) may consist of a space separated list of
values, one for each element matched. The last value is repeated to
fill out any missing values. Such a list may not contain more values
than there are elements in the @from attribute:</p>
<pre> if len(@from) < len(@list) then
   error
 else
   while len(@from) > len(@list)
     append lastitem(@list) to @list
   endwhile
 endif</pre>
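<p>The same padding rule can be sketched in Python. The function name <code>expand_order_list</code> is hypothetical, and the range check applies only to integer lists such as @order and @tertiary:</p>

```python
def expand_order_list(attribute: str, match_length: int) -> list:
    """Parse a space separated integer list and pad it to match_length."""
    values = [int(item) for item in attribute.split()]
    if not values:
        raise ValueError("the list must contain at least one value")
    if any(value < -128 or value > 127 for value in values):
        raise ValueError("values must be between -128 and +127 inclusive")
    if len(values) > match_length:
        raise ValueError("more values than elements matched by @from")
    # Repeat the last value to fill out any missing positions.
    return values + [values[-1]] * (match_length - len(values))
```

<p>For example, <code>expand_order_list("10", 2)</code> yields one value per matched element, both equal to 10.</p>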
<p>For example, consider the Northern Thai
(nod-Lana) word ᨡ᩠ᩅᩫ᩶ 'roasted'. This is ideally encoded as the
following:</p>
<table class='simple'>
<tr>
<th>name</th>
<td><em>ka</em></td>
<td><em>asat</em></td>
<td><em>wa</em></td>
<td><em>o</em></td>
<td><em>t2</em></td>
</tr>
<tr>
<th>code</th>
<td>1A21</td>
<td>1A60</td>
<td>1A45</td>
<td>1A6B</td>
<td>1A76</td>
</tr>
<tr>
<th>ccc</th>
<td>0</td>
<td>9</td>
<td>0</td>
<td>0</td>
<td>230</td>
</tr>
</table>
<p>(That sequence is already in NFC format.)</p>
<p>Some users may type the upper component of the
vowel first, and the tone before or after the lower component. Thus
someone might type it as:</p>
<table class='simple'>
<tr>
<th>name</th>
<td><em>ka</em></td>
<td><em>o</em></td>
<td><em>t2</em></td>
<td><em>asat</em></td>
<td><em>wa</em></td>
</tr>
<tr>
<th>code</th>
<td>1A21</td>
<td>1A6B</td>
<td>1A76</td>
<td>1A60</td>
<td>1A45</td>
</tr>
<tr>
<th>ccc</th>
<td>0</td>
<td>0</td>
<td>230</td>
<td>9</td>
<td>0</td>
</tr>
</table>
<p>The Unicode NFC format of that typed value
reorders to:</p>
<table class='simple'>
<tr>
<th>name</th>
<td><em>ka</em></td>
<td><em>o</em></td>
<td><em>asat</em></td>
<td><em>t2</em></td>
<td><em>wa</em></td>
</tr>
<tr>
<th>code</th>
<td>1A21</td>
<td>1A6B</td>
<td>1A60</td>
<td>1A76</td>
<td>1A45</td>
</tr>
<tr>
<th>ccc</th>
<td>0</td>
<td>0</td>
<td>9</td>
<td>230</td>
<td>0</td>
</tr>
</table>
<p>
Finally, the user might also type in the sequence with the tone <em>after</em>
the lower component.
</p>
<table class='simple'>
<tr>
<th>name</th>
<td><em>ka</em></td>
<td><em>o</em></td>
<td><em>asat</em></td>
<td><em>wa</em></td>
<td><em>t2</em></td>
</tr>
<tr>
<th>code</th>
<td>1A21</td>
<td>1A6B</td>
<td>1A60</td>
<td>1A45</td>
<td>1A76</td>
</tr>
<tr>
<th>ccc</th>
<td>0</td>
<td>0</td>
<td>9</td>
<td>0</td>
<td>230</td>
</tr>
</table>
<p>(That sequence is already in NFC format.)</p>
<p>We want all of these sequences to end up
ordered as the first. To do this, we use the following rules:</p>
<pre> <reorder from="\u1A60" order="127"/> <!-- max possible order -->
 <reorder from="\u1A6B" order="42"/>
 <reorder from="[\u1A75-\u1A7C]" order="55"/>
 <reorder before="\u1A6B" from="\u1A60\u1A45" order="10"/>
 <reorder before="\u1A6B[\u1A75-\u1A7C]" from="\u1A60\u1A45" order="10"/>
 <reorder before="\u1A6B" from="\u1A60[\u1A75-\u1A7C]\u1A45" order="10 55 10"/></pre>
<p>
The first reorder is the default ordering for the <i>asat</i> which
allows for it to be placed anywhere in a sequence, but moves any
non-consonants that may immediately follow it, back before it in the
sequence. The next two rules give the orders for the top vowel
component and tone marks respectively. The next three rules give the
<i>asat</i> and <i>wa</i> characters a primary order that places them
before the <em>o</em>. Notice particularly the final reorder rule
where the <i>asat</i>+<i>wa</i> is split by the tone mark. This rule
is necessary in case someone types into the middle of previously
normalized text.
</p>
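<p>As an illustration only, the following Python sketch applies the six rules above to the four sequences shown earlier. It is a simplified model, not a reference implementation: each @before and @from is written as a list of single-character regular-expression elements, only non-negative @order values are handled, and @after, @tertiary and @prebase are ignored. The names <code>RULES</code>, <code>compile_rules</code>, <code>assign_orders</code> and <code>reorder</code> are hypothetical.</p>

```python
import re

# (before elements, from elements, order values), transcribed from the rules above.
RULES = [
    ([], ["\u1A60"], [127]),
    ([], ["\u1A6B"], [42]),
    ([], ["[\u1A75-\u1A7C]"], [55]),
    (["\u1A6B"], ["\u1A60", "\u1A45"], [10]),
    (["\u1A6B", "[\u1A75-\u1A7C]"], ["\u1A60", "\u1A45"], [10]),
    (["\u1A6B"], ["\u1A60", "[\u1A75-\u1A7C]", "\u1A45"], [10, 55, 10]),
]


def compile_rules(rules):
    """Build regexes and sort by @from length, then by context length."""
    compiled = []
    for before, frm, orders in rules:
        # Repeat the last order value for any unlisted elements.
        orders = orders + [orders[-1]] * (len(frm) - len(orders))
        look = "(?<=" + "".join(before) + ")" if before else ""
        pattern = re.compile(look + "".join(frm))
        compiled.append((len(frm), len(before), pattern, orders))
    compiled.sort(key=lambda rule: (rule[0], rule[1]), reverse=True)
    return compiled


def assign_orders(text, compiled):
    """Give every character an order value; unmatched characters get 0."""
    orders = [0] * len(text)
    i = 0
    while i < len(text):
        for length, _, pattern, rule_orders in compiled:
            if pattern.match(text, i):
                orders[i:i + length] = rule_orders
                i += length
                break
        else:
            i += 1
    return orders


def reorder(text, compiled):
    """Sort each run by (order, index); an order of 0 starts a new run."""
    orders = assign_orders(text, compiled)
    output, run = [], []
    for index, ch in enumerate(text):
        if orders[index] == 0 and run:
            output.extend(sorted(run))
            run = []
        run.append((orders[index], index, ch))
    output.extend(sorted(run))
    return "".join(ch for _, _, ch in output)


COMPILED = compile_rules(RULES)
TARGET = "\u1A21\u1A60\u1A45\u1A6B\u1A76"
TYPED = [
    "\u1A21\u1A60\u1A45\u1A6B\u1A76",  # ideal encoding
    "\u1A21\u1A6B\u1A76\u1A60\u1A45",  # upper vowel and tone typed first
    "\u1A21\u1A6B\u1A60\u1A76\u1A45",  # NFC form of the previous sequence
    "\u1A21\u1A6B\u1A60\u1A45\u1A76",  # tone typed after the lower component
]
results = [reorder(sequence, COMPILED) for sequence in TYPED]
```

<p>Under these assumptions every entry of <code>results</code> equals <code>TARGET</code>, which is the behaviour the rules are intended to produce.</p>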
<p><reorder> elements are priority ordered
based first on the length of string their @from attribute matches and
then the sum of the lengths of the strings their @before and @after
attributes match.</p>
<p>If a layout has two <transforms> elements
of type reorder, e.g. from importing one and specifying the second,
then their <reorder> elements are merged. The @from string in a
<reorder> element describes a set of strings that it matches.
This also holds for the @before and @after attributes. The
intersection of two <reorder> elements consists of the
intersections of their @from, @before and @after string sets. It is
illegal for the intersection between any two <reorder> elements
in the same <transforms> element to be non empty, although
implementors are encouraged to have pity on layout authors when
reporting such errors, since they can be hard to track down.</p>
<p>If two <reorder> elements in two
different <transforms> elements have a non empty intersection,
then they are split and merged. They are split such that where there
were two <reorder> elements, there are, in effect (but not
actuality), three elements consisting of:</p>
<ul>
<li>@from, @before, @after that match the
intersection of the two rules. The other attributes are merged, as
described below.</li>
<li>@from, @before, @after that match the set of
strings in the first rule not in the intersection with the other
attributes from the first rule.</li>
<li>@from, @before, @after that match the set of
strings in the second rule not in the intersection, with the other
attributes from the second rule.</li>
</ul>
<p>When merging the other attributes, the second
rule is taken to have priority (occurring later in the layout
description file). Where the second rule does not define the value
for a character but the first does, it is taken from the first rule,
otherwise it is taken from the second rule.</p>
<p>Notice that it is possible for two rules to
match the same string, but for them not to merge because the
distribution of the string across @before, @from, and @after is
different. For example:</p>
<pre> <reorder before="ab" from="cd" after="e"/></pre>
<p>would not merge with:</p>
<pre> <reorder before="a" from="bcd" after="e"/></pre>
<p>When two <reorders> elements merge as the
result of an import, the resulting reorder elements are sorted into
priority order for matching.</p>
<p>Consider this fragment from a shared reordering
for the Myanmar script:</p>
<pre><!-- medial-r -->
<reorder from="\u103C" order="20"/>
<!-- [medial-wa or shan-medial-wa] -->
<reorder from="[\u103D\u1082]" order="25"/>
<!-- [medial-ha or shan-medial-wa]+asat = Mon <i>asat</i> -->
<reorder from="[\u103E\u1082]\u103A" order="27"/>
<!-- [medial-ha or mon-medial-wa] -->
<reorder from="[\u103E\u1060]" order="27"/>
<!-- [e-vowel or shan-e-vowel] -->
<reorder from="[\u1031\u1084]" order="30"/>

<reorder from="[\u102D\u102E\u1033-\u1035\u1071-\u1074\u1085\u109D\uA9E5]" order="35"/></pre>
<p>A particular Myanmar keyboard layout can have
this reorders element:</p>
<pre><reorders type="reorder">
<!-- Kinzi -->
<reorder from="\u1004\u103A\u1039" order="-1"/>
<!-- e-vowel -->
<reorder from="\u1031" prebase="1"/>
<!-- medial-r -->
<reorder from="\u103C" prebase="1"/>
</reorders></pre>
<p>The effect of this is that the <em>e-vowel</em> will be identified as a prebase and will have an order of 30.
Likewise a <em>medial-r</em> will be identified as a prebase and will have an
order of 20. Notice that a <em>shan-e-vowel</em> will not be identified as a prebase
(even if it should be!). The <em>kinzi</em> is described in the layout since
it moves something across a run boundary. By separating such
movements (prebase or moving to in front of a base) from the shared
ordering rules, the shared ordering rules become a self-contained
combining order description that can be used in other keyboards or
even in other contexts than keyboarding. </p>
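<p>The split-and-merge step for the <em>e-vowel</em> case can be sketched in Python as follows. This is a simplified model under stated assumptions: each rule is reduced to the set of strings its @from matches plus a dictionary of its other attributes, @before and @after are taken to be empty, and the prebase value is represented as a boolean. The function name <code>split_and_merge</code> is hypothetical.</p>

```python
def split_and_merge(first, second):
    """Split two overlapping rules into up to three effective rules.

    Each rule is (set of strings matched by @from, dict of other attributes).
    The second rule occurs later in the layout, so its attributes win.
    """
    first_set, first_attrs = first
    second_set, second_attrs = second
    common = first_set & second_set
    pieces = []
    if common:
        merged = dict(first_attrs)
        merged.update(second_attrs)  # later rule has priority where both define a value
        pieces.append((common, merged))
    if first_set - common:
        pieces.append((first_set - common, dict(first_attrs)))
    if second_set - common:
        pieces.append((second_set - common, dict(second_attrs)))
    return pieces


# Shared rule: [e-vowel or shan-e-vowel] with order 30.
shared = ({"\u1031", "\u1084"}, {"order": 30})
# Layout rule: e-vowel marked as a prebase.
layout = ({"\u1031"}, {"prebase": True})
pieces = split_and_merge(shared, layout)
```

<p>Here the e-vowel ends up with both the shared order and the prebase flag, while the shan-e-vowel keeps only the shared order, matching the description above.</p>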
<hr>
<h3>
5.20 <a name="Element_final" href="#Element_final">Element: final</a>
</h3>
<p>The final transform is applied after the
reorder transform. It executes in a similar way to the simple
transform with the settings ignored, as if there were no settings in
the <settings> element.</p>
<p>This is an example from Khmer where split
vowels are combined after reordering.</p>
<pre>
<transforms type="final">
<transform from="\u17C1\u17B8" to="\u17BE"/>
<transform from="\u17C1\u17B6" to="\u17C4"/>
</transforms></pre>
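<p>A minimal Python sketch of what these two rules do to already reordered text is given below. It uses plain sequential string replacement, which is only an approximation of the transform processing model and is adequate here because the rules are literal strings; the names <code>FINAL_TRANSFORMS</code> and <code>apply_final</code> are hypothetical.</p>

```python
# (from, to) pairs transcribed from the Khmer example above.
FINAL_TRANSFORMS = [
    ("\u17C1\u17B8", "\u17BE"),
    ("\u17C1\u17B6", "\u17C4"),
]


def apply_final(text: str) -> str:
    """Combine split vowels after reordering has been applied."""
    for source, target in FINAL_TRANSFORMS:
        text = text.replace(source, target)
    return text
```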
<p>Another example allows a keyboard
implementation to alert or stop people typing two lower vowels in a
Burmese cluster:</p>
<pre> <transform from="[\u102F\u1030\u1048\u1059][\u102F\u1030\u1048\u1059]" error="fail"/></pre>
<hr>
<h3>
5.21 <a name="Element_backspaces" href="#Element_backspaces">Element:
backspaces</a>
</h3>
<p>The backspace transform is an optional
transform that is not applied on input of normal characters, but is
only used to perform extra backspace modifications to previously
committed text.</p>
<p>Keyboarding applications typically work, but are not
required to work, in one of two modes:</p>
<dl>
<dt>
<b>text entry</b>
</dt>
<dd>Text entry happens while a user is typing new text. A user
typically wants the backspace key to undo whatever they last typed,
whether or not they typed things in the 'right' order.</dd>
</dl>
<dl>
<dt>
<b>text editing</b>
</dt>
<dd>Text editing happens when a user moves the cursor into some
previously entered text which may have been entered by someone else.
As such, there is no way to know in which order things were typed,
but a user will still want appropriate behaviour when they press
backspace. This may involve deleting more than one character or
replacing a sequence of characters with a different sequence.</dd>
</dl>
<p>In the text entry mode, there is no need for
any special description of backspace behaviour. A keyboarding
application will typically keep a history of previous output states
and just revert to the previous state when backspace is hit.</p>
<p>In text editing mode, different keyboard
layouts may behave differently in the same textual context. The
backspace transform allows the keyboard layout to specify the effect
of pressing backspace in a particular textual context. This is done
by specifying a set of backspace rules that match a string before the
cursor and replace it with another string. The rules are expressed as
backspace elements encapsulated in a backspaces element.</p>
<hr>
<h3>
5.22 <a name="Element_backspace" href="#Element_backspace">Element:
backspace</a>
</h3>
<p>The backspace element has the same @before,
@from, @after, @to and @error attributes as the transform element. The @to is
optional with backspace.</p>
<p>For example, consider deleting a Devanagari
ksha:</p>
<pre>
<backspaces>
<backspace from="\u0915\u094D\u0936"/>
</backspaces></pre>
<p>Here there is no @to attribute since the whole
string is being deleted. This is not uncommon in the backspace
transforms.</p>
<p>A more complex example comes from a Burmese
visually ordered keyboard:</p>
<pre><backspaces>
<!-- Kinzi -->
<backspace from="[\u1004\u101B\u105A]\u103A\u1039"/>
<!-- subjoined consonant -->
<backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021\u1050\u1051\u105A-\u105D]"/>

<!-- tone mark -->
<backspace from="\u102B\u103A"/>

<!-- Handle prebases -->
<!-- diacritics stored before e-vowel -->
<backspace from="[\u103A-\u103F\u105E-\u1060\u1082]\u1031" to="\u1031"/>
<!-- diacritics stored before medial r -->
<backspace from="[\u103A-\u103B\u105E-\u105F]\u103C" to="\u103C"/>

<!-- subjoined consonant before e-vowel -->
<backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021]\u1031" to="\u1031"/>

<!-- base consonant before e-vowel -->
<backspace from="[\u1000-\u102A\u103F-\u1049\u104E]\u1031" to="\uFDDF\u1031"/>

<!-- subjoined consonant before medial r -->
<backspace from="\u1039[\u1000-\u101C\u101E\u1020\u1021]\u103C" to="\u103C"/>

<!-- base consonant before medial r -->
<backspace from="[\u1000-\u102A\u103F-\u1049\u104E]\u103C" to="\uFDDF\u103C"/>

<!-- delete lone medial r or e-vowel -->
<backspace from="\uFDDF[\u1031\u103C]"/>
</backspaces></pre>
<p>The above example is simplified, and doesn't fully handle the interaction between medial-r and e-vowel.</p>
<p>The character \uFDDF does not represent a
literal character, but is instead a special placeholder, a
"filler string". When a keyboard implementation handles a
user pressing a key that inserts a prebase character, it also has to
insert a special filler string before the prebase to ensure that the
prebase character does not combine with the previous cluster. See the
reorder transform for details. The precise filler string is
implementation dependent. Rather than requiring keyboard layout
designers to know what the filler string is, we reserve a special
character that the keyboard layout designer may use to reference this
filler string. It is up to the keyboard implementation to, in effect,
replace that character with the filler string.</p>
<p>The first three transforms above delete various
ligatures with a single keypress. The other transforms handle prebase
characters. There are two in this Burmese keyboard. The transforms
delete the characters preceding the prebase character up to the base,
which is replaced with the prebase filler string representing
a null base. Finally, the prebase filler string + prebase is deleted
as a unit.</p>
<p>The backspace transform is much like other
transforms except in its processing model. If we consider the same
transform as in the simple transform example, but as a backspace:</p>
<blockquote><backspace
before="X" from="Y" after="Z"
to="B"/></blockquote>
<p>This would transform the string:</p>
<blockquote>XYZ → XBZ</blockquote>
<p>If we mark where the current match position is
before and after the transform we see:</p>
<blockquote>X Y | Z → X B | Z</blockquote>
<p>Whereas a simple or final transform would then
run other transforms in the transform list, advancing the processing
position until it gets to the end of the string, the backspace
transform only matches a single backspace rule and then finishes.</p>
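<p>The single-match behaviour can be sketched in Python as below. Several points are assumptions made for the sketch rather than statements of the specification: rules are tried in document order, @before and @after are omitted, and when no rule matches one code point is deleted. The rule list is a small subset of the examples above, and the names <code>BACKSPACES</code> and <code>backspace</code> are hypothetical.</p>

```python
import re

# (from pattern, to string); a missing @to is represented as "".
BACKSPACES = [
    ("\u0915\u094D\u0936", ""),                                      # Devanagari example
    ("[\u1000-\u102A\u103F-\u1049\u104E]\u1031", "\uFDDF\u1031"),  # base before e-vowel
    ("\uFDDF[\u1031\u103C]", ""),                                    # lone prebase
]


def backspace(text_before_cursor: str) -> str:
    """Apply at most one backspace rule to the text before the cursor."""
    for pattern, replacement in BACKSPACES:
        # The rule must match text that ends exactly at the cursor.
        match = re.search("(?:" + pattern + r")\Z", text_before_cursor)
        if match:
            return text_before_cursor[:match.start()] + replacement
    # Assumed fallback: delete a single code point.
    return text_before_cursor[:-1]
```

<p>Pressing backspace twice on a base consonant followed by an e-vowel first leaves the filler placeholder plus the e-vowel, and then removes both as a unit.</p>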
<hr>
<h2>
6 <a name="Element_Heirarchy_Platform_File"
href="#Element_Heirarchy_Platform_File">Element Hierarchy -
Platform File</a>
</h2>
<p>There is a separate XML structure for platform-specific
configuration elements. The most notable component is a mapping
between the hardware key codes to the ISO layout positions for that
platform.</p>
<h3>
6.1 <a name="Element_platform" href="#Element_platform">Element:
platform</a>
</h3>
<p>This is the top level element. This element contains a set of
elements defined below. A document shall only contain a single
instance of this element.</p>
<p>Syntax</p>
<pre><platform>
 {platform-specific elements}
</platform></pre>
<h3>
6.2 <a name="Element_hardwareMap" href="#Element_hardwareMap">Element:
hardwareMap</a>
</h3>
<p>This element must have a platform element as its parent. This
element contains a set of map elements defined below. A document
shall only contain a single instance of this element.</p>
<p>Syntax</p>
<pre><platform>
<hardwareMap>
{a set of map elements}
</hardwareMap>
</platform></pre>
<h3>
6.3 <a name="Element_hardwareMap_map" href="#Element_hardwareMap_map">Element:
map</a>
</h3>
<p>This element must have a hardwareMap element as its parent.
This element maps between a hardware keycode and the corresponding
ISO layout position of the key.</p>
<p>Syntax</p>
<p><map keycode="{hardware keycode}" iso="{ISO
layout position}"/></p>
<dl>
<dt>Attribute: keycode (required)</dt>
<dd>The hardware key code value of the key. This value is an
integer which is provided by the keyboard driver.</dd>
</dl>
<dl>
<dt>Attribute: iso (required)</dt>
<dd>The corresponding position of a key using the ISO layout
convention where rows are identified by letters and columns are
identified by numbers. For example, "D01" corresponds to the "Q" key
on a US keyboard. (See the definition at the beginning of the
document for a diagram).</dd>
</dl>
<p>Examples</p>
<pre><platform>
 <hardwareMap>
  <map keycode="2" iso="E01" />
  <map keycode="3" iso="E02" />
  <map keycode="4" iso="E03" />
  <map keycode="5" iso="E04" />
  <map keycode="6" iso="E05" />
  <map keycode="7" iso="E06" />
  <map keycode="41" iso="E00" />
 </hardwareMap>
</platform></pre>
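<p>As a minimal sketch, a platform file of this shape could be read into a keycode-to-position table with Python's standard <code>xml.etree.ElementTree</code> module. The function name <code>load_hardware_map</code> and the inline sample are illustrative only.</p>

```python
import xml.etree.ElementTree as ET

PLATFORM_XML = """
<platform>
  <hardwareMap>
    <map keycode="2" iso="E01" />
    <map keycode="3" iso="E02" />
    <map keycode="4" iso="E03" />
    <map keycode="5" iso="E04" />
    <map keycode="6" iso="E05" />
    <map keycode="7" iso="E06" />
    <map keycode="41" iso="E00" />
  </hardwareMap>
</platform>
"""


def load_hardware_map(xml_text: str) -> dict:
    """Return a dict mapping integer hardware keycodes to ISO positions."""
    root = ET.fromstring(xml_text.strip())
    hardware_map = root.find("hardwareMap")
    if hardware_map is None:
        return {}
    return {
        int(entry.get("keycode")): entry.get("iso")
        for entry in hardware_map.findall("map")
    }


keycode_to_iso = load_hardware_map(PLATFORM_XML)
```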
<h2>
7 <a name="Invariants" href="#Invariants">Invariants</a>
</h2>
<p>Beyond what the DTD imposes, certain other restrictions are
imposed on the data.</p>
<ol>
<li>For a given platform, every map[@iso] value must be in the
hardwareMap if there is one (_keycodes.xml)</li>
<li>Every map[@base] value must also be in base[@base] value</li>
<li>No keyMap[@modifiers] value can overlap with another
keyMap[@modifiers] value.
<ul>
<li>eg you can't have "RAlt Ctrl" in one keyMap, and "Alt
Shift" in another (because Alt = RAltLAlt).</li>
</ul>
</li>
<li>Every sequence of characters in a transform[@from] value
must be a concatenation of two or more map[@to] values.
<ul>
<li>eg with <transform from="xyz" to="q"> there must be
some map values to get there, such as <map... to="xy"> &
<map... to="z"></li>
</ul>
</li>
<li>There must be either 0 or 1 of (keyMap[@fallback] or
baseMap[@fallback]) attributes</li>
<li>If the base and chars values for modifiers="" are all
identical, and there are no longpresses, that keyMap must not appear
(??)</li>
<li>There will never be overlaps among modifier values.</li>
<li>A modifier set will never have ? (optional) on all values
<ul>
<li>eg, you'll never have RCtrl?Caps?LShift?</li>
</ul>
</li>
<li>Every base[@base] value must be unique.</li>
<li>A modifier attribute value will always be minimal, observing
the following simplification rules.
</li>
</ol>
<table>
<!-- nocaption -->
<tbody>
<tr>
<td><p>Notation</p></td>
<td><p>Notes</p></td>
</tr>
<tr>
<td><p>
Lower case character (eg. <i>x</i> )
</p></td>
<td><p>
Interpreted as any combination of modifiers.<br> (eg. <i>x</i>
= CtrlShiftOption)
</p></td>
</tr>
<tr>
<td><p>
Upper-case character (eg. <i>Y </i>)
</p></td>
<td><p>
Interpreted as a single modifier key (which may or may not have a
L and R variant)<br> (eg. <i>Y</i> = Ctrl, <i>RY</i> =
RCtrl, etc..)
</p></td>
</tr>
<tr>
<td><p>Y? ⇔ Y ∨ ∅</p>
<p>Y ⇔ LY ∨ RY ∨ LYRY</p></td>
<td><p>
Eg. Opt? ⇔ ROpt ∨ LOpt ∨ ROptLOpt ∨ ∅<br> Eg. Opt ⇔ ROpt ∨
LOpt ∨ ROptLOpt
</p></td>
</tr>
</tbody>
</table>
<table>
<!-- nocaption -->
<tbody>
<tr>
<td><p>Axiom</p></td>
<td><p>Example</p></td>
</tr>
<tr>
<td><p>xY ∨ x ⇒ xY?</p></td>
<td><p>OptCtrlShift OptCtrl → OptCtrlShift?</p></td>
</tr>
<tr>
<td><p>xRY ∨ xY? ⇒ xY?</p>
<p>xLY ∨ xY? ⇒ xY?</p></td>
<td><p>OptCtrlRShift OptCtrlShift? → OptCtrlShift?</p></td>
</tr>
<tr>
<td><p>xRY? ∨ xY ⇒ xY?</p>
<p>xLY? ∨ xY ⇒ xY?</p></td>
<td><p>OptCtrlRShift? OptCtrlShift → OptCtrlShift?</p></td>
</tr>
<tr>
<td><p>xRY? ∨ xY? ⇒ xY?</p>
<p>xLY? ∨ xY? ⇒ xY?</p></td>
<td><p>OptCtrlRShift? OptCtrlShift? → OptCtrlShift?</p></td>
</tr>
<tr>
<td><p>xRY ∨ xY ⇒ xY</p>
<p>xLY ∨ xY ⇒ xY</p></td>
<td><p>OptCtrlRShift OptCtrlShift → OptCtrlShift</p></td>
</tr>
<tr>
<td><p>xLY?RY? ⇒ xY?</p></td>
<td><p>OptRCtrl?LCtrl? → OptCtrl?</p></td>
</tr>
<tr>
<td><p>xLY? ∨ xLY ⇒ xLY?</p></td>
<td> </td>
</tr>
<tr>
<td><p>xY? ∨ xY ⇒ xY?</p></td>
<td> </td>
</tr>
<tr>
<td><p>xY? ∨ x ⇒ xY?</p></td>
<td> </td>
</tr>
<tr>
<td><p>xLY? ∨ x ⇒ xLY?</p></td>
<td> </td>
</tr>
<tr>
<td><p>xLY ∨ x ⇒ xLY?</p></td>
<td> </td>
</tr>
</tbody>
</table>
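<p>The axioms can be checked mechanically by expanding each modifier expression into the set of concrete key combinations it denotes, following the notation table above. The Python sketch below does this under two assumptions: expressions are given as lists of tokens rather than concatenated strings, and every modifier in the hypothetical set <code>BASES</code> has both L and R variants. The names <code>token_options</code> and <code>expand</code> are also hypothetical.</p>

```python
from itertools import product

# Assumed modifier names, each treated as having L and R variants.
BASES = {"Opt", "Ctrl", "Shift"}


def token_options(token):
    """Return the alternative key sets that a single token stands for."""
    optional = token.endswith("?")
    name = token[:-1] if optional else token
    if name in BASES:
        left, right = "L" + name, "R" + name
        # Y stands for LY, RY, or both together.
        options = [frozenset([left]), frozenset([right]), frozenset([left, right])]
    elif name[:1] in ("L", "R") and name[1:] in BASES:
        options = [frozenset([name])]
    else:
        raise ValueError("unknown modifier: " + token)
    if optional:
        # Y? additionally allows the modifier to be absent.
        options.append(frozenset())
    return options


def expand(tokens):
    """Return every concrete combination of keys matched by the tokens."""
    choices = [token_options(token) for token in tokens]
    return {frozenset().union(*combo) for combo in product(*choices)}


x = ["Opt", "Ctrl"]
```

<p>With this expansion, two modifier values overlap exactly when their expanded sets intersect, and a simplification is valid when the union of the left-hand sides equals the expansion of the right-hand side.</p>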
<h2>
8 <a name="Data_Sources" href="#Data_Sources">Data Sources</a>
</h2>
<p>Here is a list of the data sources used to generate the initial
key map layouts:</p>
<table>
<caption>
<a name="Key_Map_Data_Sources" href="#Key_Map_Data_Sources">Key
Map Data Sources</a>
</caption>
<tbody>
<tr>
<td><p>Platform</p></td>
<td><p>Source</p></td>
<td><p>Notes</p></td>
</tr>
<tr>
<td><p>Android</p></td>
<td><p>
Android 4.0 - Ice Cream Sandwich<br> (<a
href="http://source.android.com/source/downloading.html">http://source.android.com/source/downloading.html</a>)
</p></td>
<td><p>Parsed layout files located in
packages/inputmethods/LatinIME/java/res</p></td>
</tr>
<tr>
<td><p>ChromeOS</p></td>
<td><p>
XKB (<a href="http://www.x.org/wiki/XKB">http://www.x.org/wiki/XKB</a>)
</p></td>
<td><p>The ChromeOS layouts represent a very small subset of the
keyboards available from XKB.</p></td>
</tr>
<tr>
<td><p>Mac OSX</p></td>
<td><p>
Ukelele bundled System Keyboards (<a
href="http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=ukelele">http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=ukelele</a>)
</p></td>
<td><p>These layouts date from Mac OSX 10.4 and are
therefore a bit outdated</p></td>
</tr>
<tr>
<td><p>Windows</p></td>
<td><p>
Generated .klc files from the Microsoft Keyboard Layout Creator (<a
href="http://msdn.microsoft.com/en-us/goglobal/bb964665">http://msdn.microsoft.com/en-us/goglobal/bb964665</a>)
</p></td>
<td><p>
For interactive layouts, see also <a
href="http://msdn.microsoft.com/en-us/goglobal/bb964651">http://msdn.microsoft.com/en-us/goglobal/bb964651</a>
</p></td>
</tr>
</tbody>
</table>
<h2>
9 <a name="Keyboard_IDs" href="#Keyboard_IDs">Keyboard IDs</a>
</h2>
<p>There is a set of subtags that help identify the keyboards.
Each of these is used after the "t-k0" subtags to help identify the
keyboards. The first tag appended is a mandatory platform tag,
followed by zero or more tags that help differentiate the keyboard
from others with the same locale code.</p>
<h3>
9.1 <a name="Principles_for_Keyboard_Ids"
href="#Principles_for_Keyboard_Ids">Principles for Keyboard Ids</a>
</h3>
<p>The following are the design principles for the ids.</p>
<ol>
<li>BCP47 compliant.
<ol>
<li>Eg, "en-t-k0-extended".</li>
</ol>
</li>
<li>Use the minimal language id based on likelySubtags.
<ol>
<li>Eg, instead of en-US-t-k0-xxx, use en-t-k0-xxx. Because
there is <likelySubtag from="en"
to="en_Latn_US"/>, en-US → en.</li>
<li>The data is in <a
href="http://unicode.org/repos/cldr/tags/latest/common/supplemental/likelySubtags.xml">http://unicode.org/repos/cldr/tags/latest/common/supplemental/likelySubtags.xml</a></li>
</ol>
</li>
<li>The platform goes first, if it exists. If a keyboard on the
platform changes over time, both are dated, eg
bg-t-k0-chromeos-2011. When selecting, if there is no date, it means
the latest one.</li>
<li>Only keyboards that differ from the "standard for
each platform" are tagged. That is, for each language on a platform, there will
be a keyboard with no subtags other than the platform. Subtags with
common semantics across platforms are used, such as -extended,
-phonetic, -qwerty, -qwertz, -azerty, …</li>
<li>In order to get to 8 letters, abbreviations are reused that
are already in <a
href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/">bcp47</a>
-u/-t extensions and in <a
href="http://www.iana.org/assignments/language-subtag-registry">language-subtag-registry</a>
variants, eg for Traditional use "-trad" or "-traditio" (both exist
in <a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/">bcp47</a>).
</li>
<li>Multiple languages cannot be indicated, so the predominant
target is used.
<ol>
<li>For Finnish + Sami, use fi-t-k0-smi or extended-smi</li>
</ol>
</li>
<li>In some cases, there are multiple subtags, like
en-US-t-k0-chromeos-intl-altgr.xml</li>
<li>Otherwise, platform names are used as a guide.</li>
</ol>
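<p>As a small sketch of how such an id decomposes, the following Python function splits an id at the "t-k0" subtags and checks that each following subtag is 3 to 8 ASCII alphanumeric characters, as BCP47 requires for the values of a -t- extension field. The function name <code>parse_keyboard_id</code> is hypothetical, and the sketch does not validate the language part or decide which subtag is the platform.</p>

```python
def parse_keyboard_id(keyboard_id: str):
    """Split a keyboard id into (language id, list of k0 subtags)."""
    language, marker, rest = keyboard_id.partition("-t-k0-")
    if not marker or not language or not rest:
        raise ValueError("expected <language>-t-k0-<subtags>")
    subtags = rest.split("-")
    for subtag in subtags:
        # Each subtag is limited to 3..8 ASCII letters or digits.
        if not (3 <= len(subtag) <= 8 and subtag.isascii() and subtag.isalnum()):
            raise ValueError("invalid subtag: " + repr(subtag))
    return language, subtags
```

<p>For "bg-t-k0-chromeos-2011" this returns the language "bg" and the subtags "chromeos" and "2011"; by the principles above, the first subtag is normally the platform.</p>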
<h2>
10 <a name="Platform_Behaviors_in_Edge_Cases"
href="#Platform_Behaviors_in_Edge_Cases">Platform Behaviors in
Edge Cases</a>
</h2>
<table>
<!-- nocaption -->
<tbody>
<tr>
<td><p>Platform</p></td>
<td><p>No modifier combination match is available</p></td>
<td><p>No map match is available for key position</p></td>
<td><p>Transform fails (ie. if ^d is pressed when that
transform does not exist)</p></td>
</tr>
<tr>
<td><p>ChromeOS</p></td>
<td><p>Fall back to base</p></td>
<td><p>
Fall back to character in a keyMap with same "level" of modifier
combination. If this character does not exist, fall back to (n-1)
level. (This is handled data-generation side).<br> In the
spec: No output
</p></td>
<td><p>No output at all</p></td>
</tr>
<tr>
<td><p>Mac OSX</p></td>
<td><p>Fall back to base (unless combination is some sort
of keyboard shortcut, eg. cmd-c)</p></td>
<td><p>No output</p></td>
<td><p>Both keys are output separately</p></td>
</tr>
<tr>
<td><p>Windows</p></td>
<td><p>No output</p></td>
<td><p>No output</p></td>
<td><p>Both keys are output separately</p></td>
</tr>
</tbody>
</table>
<p> </p>
<hr>
<p class="copyright">
Copyright © 2001–2018 Unicode, Inc. All Rights Reserved. The Unicode
Consortium makes no expressed or implied warranty of any kind, and
assumes no liability for errors or omissions. No liability is assumed
for incidental and consequential damages in connection with or
arising out of the use of the information or programs contained or
accompanying this technical report. The Unicode <a
href="http://unicode.org/copyright.html">Terms of Use</a> apply.
</p>
<p class="copyright">Unicode and the Unicode logo are trademarks
of Unicode, Inc., and are registered in some jurisdictions.</p>
</div>
</body>
</html>