This topic is the starting point for the complete list of characters available for use in Duxbury DBT. Because the list has become remarkably long, this topic provides links to separate tables of those characters, grouped by their Unicode range. In the individual tables, each character has two identification numbers: its Unicode number and its DUSCI number. The DUSCI number is the internal code for a character in DBT.
Before presenting the Table of Unicode Ranges, we briefly discuss several script systems with special characteristics that you should be aware of if you need to work with them, starting with the Chinese Han characters.
Note: If you find you need help on ways of accessing special characters or alternative scripts, see the topic, Keyboarding into Word and DBT.
DBT does not display traditional Chinese ideogram characters. Instead, DBT displays an appropriate substitute alphabet based on the language: Mandarin, Cantonese, Japanese, or Korean (the "Han" group of language scripts). Normally, the language is selected automatically when you select the DBT template for importing the file. If necessary, you can override the automatic selection by forcing the language choice in the Global: Import Options dialog, but in almost all cases, letting the template select the correct language is what you want.
These are the choices for how Han characters are imported into DBT:
Language | DBT Characters |
---|---|
Mandarin (mainland) | Pinyin Romanization with accent marks for the tones |
Mandarin (Taiwan) | Zhuyin (Bopomofo) phonetic symbols |
Cantonese | Romanization with superscript numbers for the tones |
Japanese | Unicode U+30xx characters |
Korean | Unicode U+11xx characters |
Hangul often compacts 2 or 3 characters into a single symbol. When DBT imports a file, the process is reversed: DBT breaks a single Hangul character into its component parts. In technical terms, all Hangul characters in the range from U+AC00 through U+D7AF are redirected into Hangul Jamo characters in the U+11xx range. DBT uses a mono-spaced font to display those characters. The result can be difficult to read and is certainly jarring to those who are accustomed to reading conventional inkprint Hangul. In this case, it is best to do all editing in Microsoft Word, using DBT as the translation engine and for output.
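To make the redirection concrete, here is a minimal Python sketch of the standard Unicode arithmetic for splitting a precomposed Hangul syllable into conjoining Jamo in the U+11xx range. It illustrates the general technique only; we assume, but cannot confirm here, that DBT's import produces an equivalent result. The function names `decompose_hangul` and `codepoints` are made up for this example.

```python
import unicodedata

# Constants from the Unicode Hangul syllable decomposition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = 19 * N_COUNT        # 11172 precomposed syllables in total


def decompose_hangul(ch: str) -> str:
    """Return the Jamo sequence for one precomposed Hangul syllable.

    Characters outside the precomposed syllable range are returned unchanged.
    """
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trail = T_BASE + s_index % T_COUNT
    jamo = chr(lead) + chr(vowel)
    if trail != T_BASE:           # T_BASE itself means "no trailing consonant"
        jamo += chr(trail)
    return jamo


def codepoints(text: str) -> list:
    """Format each character of text as a U+XXXX string (illustrative helper)."""
    return [f"U+{ord(c):04X}" for c in text]


# The syllable U+D55C splits into three Jamo characters:
# codepoints(decompose_hangul("\uD55C")) -> ['U+1112', 'U+1161', 'U+11AB']
# Python's own unicodedata.normalize("NFD", ...) yields the same sequence.
```

A single syllable therefore becomes two or three separate characters, which is why imported Korean text looks so different from conventional inkprint Hangul.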
Both Arabic and Hebrew inkprint are written from right to left. DBT displays the inkprint Arabic and Hebrew text from right to left as well. However, the DBT editing cursor does not (as yet) accommodate the right to left flow within a line.
For these scripts, it is best to do all editing in Microsoft Word, using DBT solely as the translation engine and output manager. In this case, if you need to, you can copy whole lines from Word into DBT through the clipboard as a way of making changes within a line.
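If you want to check in advance whether a piece of text contains right-to-left characters, and so is better edited in Word, a small script can do it. The sketch below uses Python's standard `unicodedata` module and is not part of DBT; the function name `contains_rtl` is hypothetical.

```python
import unicodedata


def contains_rtl(text: str) -> bool:
    """Return True if any character has a right-to-left bidirectional class.

    "R" covers Hebrew letters and "AL" covers Arabic letters.
    """
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)
```

The check looks only at the Unicode bidirectional class of each character, so it says nothing about how DBT itself will lay out the line.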
As the later tables show, DBT can import a vast number of characters beyond those covered by the table in this section.
The table in this section lists the Unicode ranges that are supported inside of DBT. The first column identifies the character set, the second column gives the start of the Unicode range for that set, and the third column indicates the use of that character set. The possible uses are: for a language, for mathematics, for symbols, or for the International Phonetic Alphabet. The entries in the second column are hyperlinks to the individual tables, each one showing the full Unicode range for that character set and the DUSCI equivalent for each Unicode character.
Script | Unicode and Link | Type |
---|---|---|
Latin | U+00xx | Language |
Latin Extended | U+01xx | Language |
Latin Extended | U+02xx | Language |
Greek | U+03xx | Language |
Cyrillic | U+04xx | Language |
Armenian and Hebrew | U+05xx | Language |
Arabic | U+06xx | Language |
Syriac and Thaana | U+07xx | Language |
Hindi and Bengali | U+09xx | Language |
Gurmukhi and Gujarati | U+0Axx | Language |
Oriya and Tamil | U+0Bxx | Language |
Telugu and Kannada | U+0Cxx | Language |
Malayalam and Sinhala | U+0Dxx | Language |
Thai and Lao | U+0Exx | Language |
Tibetan | U+0Fxx | Language |
Myanmar and Georgian | U+10xx | Language |
Korean | U+11xx | Language |
Ethiopic | U+12xx | Language |
Ethiopic | U+13xx | Language |
Khmer (Cambodian) | U+17xx | Language |
IPA 1 | U+1Dxx | IPA |
IPA 2 | U+1Exx | IPA |
Misc Symbols | U+20xx | Symbols |
Arrows, etc. | U+21xx | Math |
Math Operators | U+22xx | Math |
Misc Technical | U+23xx | Math |
Box Drawing | U+25xx | Math |
Dingbats | U+27xx | Symbols |
Math Arrows | U+29xx | Math |
Math Operators | U+2Axx | Math |
Japanese | U+30xx | Language |
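The "U+NNxx" labels in the second column are simply the high byte of the code point followed by "xx". As a small illustration, this Python sketch computes that label for a character; the function name `table_key` is hypothetical and not part of DBT.

```python
def table_key(char: str) -> str:
    """Return the 'U+NNxx' label of the 256-character page containing char."""
    cp = ord(char)
    if cp > 0xFFFF:
        raise ValueError("labels in this table cover only U+0000 through U+FFFF")
    high_byte = cp >> 8
    return f"U+{high_byte:02X}xx"
```

Note that a computed label does not guarantee that a table exists for it: only the ranges listed above have tables.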
DBT 12.7 sr1 can read thousands of characters beyond U+FFFF. Please read this external web page for the details.
This second, very extensive, table lists all of the 4-digit Unicode ranges, including those not supported in DBT.
The first column indicates the start and end of each range. The second column identifies the character set or other use covered by that range. You will see that most of these entries are hyperlinks.
Note, however, that these are all external hyperlinks to web pages, not links to pages inside DBT Help.
If the last column is blank, then there is no support for these characters in Duxbury DBT. If the last column is "Word import", it means that these characters are supported by conversion into other Duxbury supported characters during import. If the last column is "DUSCI supported" then that Unicode range is in the previous table (in the section above).
Unicode Range | Name and Wikipedia Link | Support Level |
---|---|---|
U+0000-007F | Basic Latin | DUSCI supported |
U+0080-00FF | Latin-1 Supplement | DUSCI supported |
U+0100-017F | Latin Extended-A | DUSCI supported |
U+0180-024F | Latin Extended-B | DUSCI supported |
U+0250-02AF | IPA Extensions | DUSCI supported |
U+02B0-02FF | Spacing Modifier Letters | DUSCI supported |
U+0300-036F | Combining Diacritical Marks | DUSCI supported |
U+0370-03FF | Greek and Coptic | DUSCI supported |
U+0400-04FF | Cyrillic | DUSCI supported |
U+0500-052F | Cyrillic Supplement | DUSCI supported |
U+0530-058F | Armenian | DUSCI supported |
U+0590-05FF | Hebrew | DUSCI supported |
U+0600-06FF | Arabic | DUSCI supported |
U+0700-074F | Syriac | DUSCI supported (use Biblical languages) |
U+0750-077F | Arabic Supplement | DUSCI supported (use West African languages) |
U+0780-07BF | Thaana | DUSCI supported (use Dhivehi) |
U+07C0-07FF | N'Ko | DUSCI supported (use West African languages) |
U+0800-083F | Samaritan | |
U+0840-085F | Mandaic | Word import |
U+0860-086F | Syriac Supplement (Malayalam) | |
U+0870-089F | Arabic Extended-B | |
U+08A0-08FF | Arabic Extended-A | |
U+0900-097F | Devanagari | DUSCI supported |
U+0980-09FF | Bengali | DUSCI supported |
U+0A00-0A7F | Gurmukhi | DUSCI supported |
U+0A80-0AFF | Gujarati | DUSCI supported |
U+0B00-0B7F | Oriya | DUSCI supported |
U+0B80-0BFF | Tamil | DUSCI supported |
U+0C00-0C7F | Telugu | DUSCI supported |
U+0C80-0CFF | Kannada | DUSCI supported |
U+0D00-0D7F | Malayalam | DUSCI supported |
U+0D80-0DFF | Sinhala | DUSCI supported |
U+0E00-0E7F | Thai | DUSCI supported |
U+0E80-0EFF | Lao | DUSCI supported |
U+0F00-0FFF | Tibetan | DUSCI supported |
U+1000-109F | Myanmar | DUSCI supported |
U+10A0-10FF | Georgian | DUSCI supported |
U+1100-11FF | Hangul Jamo | DUSCI supported (redirected Korean) |
U+1200-137F | Ethiopic (Ge'ez) | DUSCI supported |
U+1380-139F | Ethiopic Supplement | DUSCI supported |
U+13A0-13FF | Cherokee | experimental, contact languages@duxsys.com |
U+1400-167F | Unified Canadian Aboriginal Syllabics | Word import (use English UEB) |
U+1680-169F | Ogham | |
U+16A0-16FF | Runic | |
U+1700-171F | Tagalog (Baybayin) | Word import |
U+1720-173F | Hanunoo | Word import |
U+1740-175F | Buhid | Word import |
U+1760-177F | Tagbanwa | Word import |
U+1780-17FF | Khmer | DUSCI supported |
U+1800-18AF | Mongolian | |
U+18B0-18FF | Extended Canadian Aboriginal syllabics | Word import |
U+1900-194F | Limbu | Word import (use Hindi) |
U+1950-197F | Tai Le | Word import |
U+1980-19DF | New Tai Lue | Word import |
U+19E0-19FF | Khmer Symbols | |
U+1A00-1A1F | Buginese (Lontara) | Word import |
U+1A20-1AAF | Tai Tham | |
U+1AB0-1AFF | Combining Diacritical Marks Extended | |
U+1B00-1B7F | Balinese | |
U+1B80-1BBF | Sundanese | Word import |
U+1BC0-1BFF | Batak | Word import |
U+1C00-1C4F | Lepcha | Word import |
U+1C50-1C7F | Ol Chiki | |
U+1CC0-1CCF | Sundanese supplement | |
U+1CD0-1CFF | Vedic Extensions | |
U+1D00-1D7F | Phonetic Extensions | Word import |
U+1D80-1DBF | Phonetic Extensions Supplement | Word import |
U+1DC0-1DFF | Combining Diacritical Marks Supplement | |
U+1E00-1EFF | Latin Extended Additional | DUSCI supported |
U+1F00-1FFF | Greek Extended | Word import |
U+2000-206F | General Punctuation | DUSCI supported |
U+2070-209F | Superscripts and Subscripts | DUSCI supported |
U+20A0-20CF | Currency Symbols | DUSCI supported |
U+20D0-20FF | Combining Diacritical Marks for Symbols | |
U+2100-214F | Letterlike Symbols | DUSCI supported |
U+2150-218F | Number Forms | DUSCI supported |
U+2190-21FF | Arrows | DUSCI supported |
U+2200-22FF | Mathematical Operators | DUSCI supported |
U+2300-23FF | Miscellaneous Technical | DUSCI supported |
U+2400-243F | Control Pictures | Word import |
U+2440-245F | Optical Character Recognition | Word import |
U+2460-24FF | Enclosed Alphanumerics | Word import |
U+2500-257F | Box Drawing | DUSCI supported |
U+2580-259F | Block Elements | DUSCI supported |
U+25A0-25FF | Geometric Shapes | DUSCI supported |
U+2600-26FF | Miscellaneous Symbols | Word import |
U+2700-27BF | Dingbats | DUSCI supported |
U+27C0-27EF | Miscellaneous Mathematical Symbols-A | DUSCI supported |
U+27F0-27FF | Supplemental Arrows-A | DUSCI supported |
U+2800-28FF | Braille Patterns | Word import |
U+2900-297F | Supplemental Arrows-B | DUSCI supported |
U+2980-29FF | Miscellaneous Mathematical Symbols-B | DUSCI supported |
U+2A00-2AFF | Supplemental Mathematical Operators | DUSCI supported |
U+2B00-2BFF | Miscellaneous Symbols and Arrows | |
U+2C00-2C5F | Glagolitic | |
U+2C60-2C7F | Latin Extended-C | |
U+2C80-2CFF | Coptic | Word import (use Biblical languages) |
U+2D00-2D2F | Georgian Supplement | |
U+2D30-2D7F | Tifinagh | Word import (use West African languages) |
U+2D80-2DDF | Ethiopic Extended | |
U+2DE0-2DFF | Cyrillic Extended-A | |
U+2E00-2E7F | Supplemental Punctuation | |
U+2E80-2EFF | CJK Radicals Supplement | |
U+2F00-2FDF | Kangxi Radicals | |
U+2FF0-2FFF | Ideographic Description Characters | |
U+3000-303F | CJK Symbols and Punctuation | DUSCI supported (use Japanese) |
U+3040-309F | Hiragana | DUSCI supported (use Japanese) |
U+30A0-30FF | Katakana | DUSCI supported (use Japanese) |
U+3100-312F | Bopomofo | DUSCI supported |
U+3130-318F | Hangul Compatibility Jamo | Word import |
U+3190-319F | Kanbun | |
U+31A0-31BF | Bopomofo Extended | |
U+31C0-31EF | CJK Strokes | |
U+31F0-31FF | Katakana Phonetic Extensions | |
U+3200-32FF | Enclosed CJK Letters and Months | |
U+3300-33FF | CJK Compatibility | |
U+3400-4DBF | CJK Unified Ideographs Extension A | Word import (see chart above on Chinese) |
U+4DC0-4DFF | Yijing Hexagram Symbols | Word import (see chart above on Chinese) |
U+4E00-9FFF | CJK Unified Ideographs | Word import (see chart above on Chinese) |
U+A000-A48F | Yi Syllables | |
U+A490-A4CF | Yi Radicals | |
U+A4D0-A4FF | Lisu (Fraser alphabet) | Word import |
U+A500-A63F | Vai | DUSCI supported (use West African languages) |
U+A640-A69F | Cyrillic Extended-B | |
U+A6A0-A6FF | Bamum | |
U+A700-A71F | Modifier Tone Letters | |
U+A720-A7FF | Latin Extended-D | |
U+A800-A82F | Syloti Nagri | Word import (use Bengali) |
U+A830-A83F | Common Indic Number Forms | |
U+A840-A87F | Phags-pa | Word import |
U+A880-A8DF | Saurashtra | Word import (use Hindi) |
U+A8E0-A8FF | Devanagari Extended | |
U+A900-A92F | Kayah Li | |
U+A930-A95F | Rejang | Word import |
U+A960-A97F | Hangul Jamo Extended-A | |
U+A980-A9DF | Javanese | |
U+A9E0-A9FF | Myanmar Extended-B | |
U+AA00-AA5F | Cham | Word import |
U+AA60-AA7F | Myanmar Extended-A | |
U+AA80-AADF | Tai Viet | |
U+AB00-AB2F | Ethiopic Extended-A | |
U+AB30-AB6F | Latin Extended-E | |
U+AB70-ABBF | Cherokee supplement | Word import |
U+ABC0-ABFF | Meitei Mayek | |
U+AC00-D7AF | Hangul Syllables | Word import (Korean) |
U+D800-DB7F | High Surrogates | |
U+DB80-DBFF | High Private Use Surrogates | |
U+DC00-DFFF | Low Surrogates | |
U+E000-F8FF | Private Use Area | |
U+F900-FAFF | CJK Compatibility Ideographs | |
U+FB00-FB4F | Alphabetic Presentation Forms | Word import (use Hebrew) |
U+FB50-FDFF | Arabic Presentation Forms-A | Word import |
U+FE00-FE0F | Variation Selectors | |
U+FE10-FE1F | Vertical Forms | |
U+FE20-FE2F | Combining Half Marks | |
U+FE30-FE4F | CJK Compatibility Forms | |
U+FE50-FE6F | Small Form Variants | Word import (use Arabic) |
U+FE70-FEFF | Arabic Presentation Forms-B | Word import |
U+FF00-FFEF | Halfwidth and Fullwidth Forms | Word import (use Japanese) |
U+FFF0-FFFF | Specials | |
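The table can also be read programmatically: given a character, find the row whose range contains it and report the support level. The following Python sketch assumes a hand-copied excerpt of six rows from the table (a real lookup would need every row), and the names `RANGES` and `support_level` are hypothetical.

```python
from bisect import bisect_right

# (start, end, name, support level) for a few rows copied from the table.
# An empty support string means the range is not supported in DBT.
RANGES = [
    (0x0000, 0x007F, "Basic Latin", "DUSCI supported"),
    (0x0590, 0x05FF, "Hebrew", "DUSCI supported"),
    (0x16A0, 0x16FF, "Runic", ""),
    (0x1F00, 0x1FFF, "Greek Extended", "Word import"),
    (0x3040, 0x309F, "Hiragana", "DUSCI supported (use Japanese)"),
    (0xAC00, 0xD7AF, "Hangul Syllables", "Word import (Korean)"),
]
STARTS = [row[0] for row in RANGES]  # sorted range starts, for binary search


def support_level(char: str):
    """Return (name, support level) for char, or None if no row covers it."""
    cp = ord(char)
    i = bisect_right(STARTS, cp) - 1   # last row starting at or before cp
    if i >= 0 and RANGES[i][0] <= cp <= RANGES[i][1]:
        return RANGES[i][2], RANGES[i][3]
    return None
```

With this excerpt, a Hebrew letter reports "DUSCI supported", a Runic letter reports an empty support level, and a character whose range was not copied into `RANGES` returns `None`.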