This is a help-text file for use with the survey tool. You can add a new row, where the Path is a regular expression for an XML path, and the Text to Insert is what you want to show up as help text, or modify existing text. The software that interprets this expects a particular format, so don't make arbitrary changes (see the end).
Path | Text to Insert | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
//ldml/localeDisplayNames.* |
Display Names
Languages, scripts (writing systems), territories (countries and
regions), currencies, and time zones are represented in computers
by internal codes, such as "
The ISO names and the "official" names are often not the
best ones for CLDR. The goal is the most customary name used in
your language, even if it is not the official name. For example,
for the territory name in English you would use "Switzerland"
instead of "Swiss Confederation", and use "United Kingdom" instead
of "The United Kingdom of Great Britain and Northern Ireland". The
best source for customary usage is to look at what common
publications such as newspapers and magazines do. For example, to
see how Congo is used in French, one might search
http://www.google.com/search?q
All names must be unique within a given category: thus one cannot use the same translated name for the following two codes; only one can be called "Congo":
Avoid using commas and avoid inverting the name (eg "Congo, Democratic Republic of the"). The characters "(" and ")" are discouraged, since they will be confusing in combination with countries in locale names. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/localeDisplayNames/(keys|types).* |
Keys
The keys page lists the key names for translation. These are the key words used to identify particular types of variants. The calendar types are typically used only with certain languages; however, they can be used with almost any language:
The collation (sort order) types, on the other hand, are only used with certain locales (listed below):
The last value (
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/localeDisplayNames/territories.* |
Territories
Territories include both country names and regions: continents and subcontinents (defined by a UN standard). All of these must be unique: for example, you can't give the same name to the country South Africa and to Southern Africa (the southern region of the continent of Africa), even though there may be no distinction in your language between the terms for "South" and "Southern". Similarly, North America is the continent that extends down to Panama; Northern America is the region of the Americas north of Mexico. The country name should be the most natural; you may have to adjust the name of the region. So you might say the equivalent of "South Region of Africa", or add clarifying language like "Amérique du Nord continentale" vs "Amérique du Nord". If you have any question as to the extent of any region, see Territory Containment.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/localeDisplayNames/languages.* |
Languages
There are a lot of languages here (around 500), and you don't need to look at them all! Many are relatively obscure, and not worth translating in a first pass. Please also look at the following points.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/localeDisplayNames/languages.*\[@type="[^"]*_[^"]*"\].* |
Compound Language Codes
Some language codes are more complex, of the form "en_AU" for Australian English. If you don't add a translation, then those will be represented by a format like "αγγλικά (Αυστραλία)". That is, the translation would be the native name for "English", followed by the native word for "Australia" in parentheses. If that format is ok, then you don't need to translate the more complex language code. The codes zh_Hant and zh_Hans (for Traditional and Simplified Chinese), on the other hand, should always be translated. There are a few special cases:
A pattern is used to control how the translations for language and region codes are composed into a name when the compound code doesn't have a specific translation. See the section "localeDisplayPattern". |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/localeDisplayNames/scripts.* |
Scripts
Normally only a few scripts are really necessary to translate: those that are used in distinguishing the most common languages that are written in multiple ways. These are Hant and Hans (for traditional and simplified Chinese), Cyrillic, Arabic, and Latin.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
.*/currencies.* |
Currencies
This is a long list that contains the currency names and currency symbols for each country, plus historical codes. The coverage level option tries to pick out the ones that are most important to translate. Each currency code can be translated in two ways:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/characters/exemplarCharacters.* |
Exemplar Character Set
The exemplar character sets contain the commonly used letters for a given modern form of a language. These are used for testing and for determining the appropriate repertoire of letters for charset conversion or text comparison. The term "letter" is interpreted broadly, and includes characters used to form words, such as 是 or 가. If a sequence of characters is considered a "letter", it will be listed between { and }. For example, {ch}. There are three categories:
Any range of characters, such as "a b c d e" can be represented compactly as "a-e". For more information, please see Section 5.6 Character Elements in UTS#35: Locale Data Markup Language (LDML). |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
.*/numbers.* |
Numbers
Numbers are formatted using patterns, like "#,###.00". Different characters stand for different parts of the number: they don't have their normal meaning! In particular, you need to use '.' for the decimal point and ',' for the thousands (grouping) separator, even if they are not used that way in your language. Here are the special characters used in number patterns.
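As a rough illustration of how '.' and ',' act as placeholders rather than literal characters, here is a minimal Python sketch. The function name and its parameters are hypothetical, and Python's own format specification stands in for the pattern "#,###.00"; it is not the formatter the survey tool itself uses.

```python
def format_with_pattern(value, decimal_sep, group_sep):
    """Format value like the pattern "#,###.00", using locale separators."""
    # Python's format spec plays the role of the pattern here:
    # ',' requests grouping by thousands and '.2f' two decimal digits.
    neutral = f"{value:,.2f}"
    # Swap the placeholder characters for the locale's own symbols.
    # A temporary marker keeps one replacement from clobbering the other.
    return (neutral.replace(",", "\x00")
                   .replace(".", decimal_sep)
                   .replace("\x00", group_sep))


# Apostrophe as grouping separator, comma as decimal separator.
print(format_with_pattern(9876543.21, ",", "'"))  # 9'876'543,21
```

The pattern stays the same for every language; only the separator symbols change.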
For example, the pattern "#,###.00" when used to format the number 12345.678 could result in "12'345,68". That would happen if the grouping separator for your language is an apostrophe, and the decimal separator is a comma. Translators should not change the pattern of zeros (0) or hash marks (#); those will be reset by software. This is true also for currency formats. Even if your currency doesn't use any decimal points, the currency format will have them in the pattern. You need to modify the patterns when:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/.*/(pattern|dateFormatItem|intervalFormats).* |
Formats for Dates and Times
Dates and times are formatted using patterns, like "MM-dd". Each field, like the month or the hour, is represented by a sequence of letters from A to Z. For example, one or more M's stand for the month. When the software formats a date for your language, a value will be substituted for each field, according to the following table.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/.*/intervalFormats.* |
Interval Formats
Interval formats are used for a range of dates or times specified by a start and end, such as "Sept 10-12" (meaning the 10th of September through the 12th of September). The pattern will be something like "MMM d–d", where some of the fields are repeated -- typically with some kind of punctuation mark separating the two fields, but some fields in the second part are omitted. The way this pattern is used is that the part up to the first repeated field is formatted with the first date, and the remainder is formatted with the second date. For example:
Each combination of fields can be used with dates that differ by different amounts. For example, a format for the fields "yMMMd" (year, abbreviated month, and day) could be used with two dates that differ by year, month, or day -- each type of difference might need a different pattern. For example:
Look carefully at each of the examples to see the kinds of formats that would be used in your language. |
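The splitting rule described above (format everything up to the first repeated field with the first date, and the rest with the second date) can be sketched in Python. This is a simplified illustration: the helper names are hypothetical, only the fields "y", "MMM" and "d" are supported, the month abbreviations are hard-coded English, and quoted literal text in patterns is not handled.

```python
import re
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
FIELD = re.compile(r"([A-Za-z])\1*")  # a run of one letter, e.g. "MMM" or "d"


def split_interval_pattern(pattern):
    """Split the pattern just before the first field letter seen twice."""
    seen = set()
    for match in FIELD.finditer(pattern):
        letter = match.group(1)
        if letter in seen:
            return pattern[:match.start()], pattern[match.start():]
        seen.add(letter)
    return pattern, ""


def format_fields(pattern, d):
    """Substitute a value for each supported field in the pattern."""
    def substitute(match):
        field = match.group(0)
        if field == "MMM":
            return MONTHS[d.month - 1]
        if field == "d":
            return str(d.day)
        if field == "y":
            return str(d.year)
        raise ValueError(f"unsupported field: {field}")
    return FIELD.sub(substitute, pattern)


def format_interval(pattern, start, end):
    first, second = split_interval_pattern(pattern)
    return format_fields(first, start) + format_fields(second, end)


print(format_interval("MMM d–d", date(2008, 9, 10), date(2008, 9, 12)))
# Sep 10–12
```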
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/.*Context.* |
Stand-Alone vs. Format Styles
Some languages use two different forms of strings (stand-alone and format) depending on the context. Typically the stand-alone version is the nominative form of the word, and the format version is in the genitive. Make sure that the correct forms are provided, especially for the months, and used in the patterns. That is, suppose that the language uses "Dezembro" for December when standing alone, but "Dezembru" when with a date (meaning the nth day of that month). Then the formats for months could be something like:
Similarly, suppose that your language formats months differently if they begin with a vowel, eg "14 de gener de 2008" but "14 d'abril de 2008". In that case, the stand-alone and format versions of the months should be:
These must be coordinated with the format strings, which can't have the extra "de" before the month:
That is, if your language uses two different forms, then make sure that there are two forms of the months or days where necessary, and adjust the date patterns to use the LLL or LLLL stand-alone form or MMM and MMMM format forms, as needed. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/calendar.*timeFormatLength |
Standard Time Formats
There are four standard time formats.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/calendar.*/quarters/.* |
Quarters
The quarters of a year are used in formats such as "2006Q3", typically used for financial periods. If your language doesn't have a common term for this, you might use the equivalent of "Jan-Mar". |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/calendar.*/fields.*displayName.* |
Date Field Labels
The date field labels are the names of date or time fields, such as "Month" or "Hour", suitable for labels in dialogs or menus. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/calendar.*/fields.*relative.* |
Relative Periods of Time
Relative fields of time are used to indicate a period relative to today, like "Yesterday" or "Tomorrow". Some languages don't have words or short phrases for some of these. For example, English does not have a word for "the day before yesterday" as some languages do, such as "Vorgestern" in German. If your language doesn't have a natural term for one of these, please do not supply a translation: instead, pick the "inherited" value, such as "The day after tomorrow". The English phrase supplied here is just a placeholder to let you know what the field means, and is not part of the actual English locale data. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/calendar.*/(a|p)m |
AM and PM
Note that even if your language doesn't use am/pm in any patterns, strings for those need to be defined for testing. As long as the 24-hour symbol (H) is used in the patterns, the am/pm strings won't show up in formatted times and dates. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/calendars/calendar.*dateTimeFormatLength.* |
Date-Time Pattern
The date-time pattern is used to make a date + time out of separate date and time patterns. The date will be substituted for {1} and the time for {0}. It usually doesn't need to be changed. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
.*narrow.* |
Narrow Date Fields
The narrow date fields are the shortest possible names (in terms of width in common fonts), and are not guaranteed to be unique. Think of what you might find on a credit-card-sized wallet or checkbook calendar, such as in English for days of the week: S M T W T F S |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
.*/eras.* |
Eras
There are only two values for an era in a Gregorian calendar, "BC" and "AD". These values can be translated into other languages, like "a.C." and "d.C." for Spanish, but there are no other eras in the Gregorian calendar. Other calendars have different numbers of eras. The names for eras are often specific to the given calendar, such as the Japanese era names. You typically only need to translate these if the calendar in question is in common use in one of the countries that uses your language. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
.*/references.* |
References
References are used to document more controversial cases. Whenever there is a disagreement between translators, or when the choice of translation might not be understood, you should add a reference.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
.*/exemplarCity.* |
Time Zone Exemplar Cities
For generic references to time zones, the country is used if possible, composed with a pattern that in English appears as "{0} Time". Thus a time zone may appear as "Malaysia Time" or "Hora de Malasia". If the country has multiple time zones, then a city is used to distinguish which one, thus "Argentina (La Rioja) Time". Thus cities normally only need to be translated if they are in a country with multiple time zones. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
.*(M|m)etazone.* |
Metazones
For some time zones, the survey tool will state that a particular
metazone is in effect. A metazone is simply a grouping of
time zones that share a common display name in customary usage.
For example,
Often there are situations where a particular time zone has an abbreviation, but the abbreviation is so seldom used that most people would not recognize it. The "commonlyUsed" field for a metazone is used to indicate that abbreviations for a particular time zone or metazone are in common use in the locale. You have two choices:
For example: In English, PST is a commonly used abbreviation for
"Pacific Standard Time", for the metazone
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/posix/messages.* |
POSIX Yes and No
The POSIX yes and no strings should be whatever counts as "Yes" and "No" in your language, plus abbreviations. Don't worry about uppercase forms; those are handled automatically. Multiple forms can be entered separated by ":", such as "ne:n". |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/layout/in(List|Text).* |
Casing Verification
These values can be used to help with testing. If the value is set to anything but "mixed", then the items of that type will be checked against that casing, to help catch inconsistencies. For example, if your language usually has the names of territories in lowercase, then set the value for territories to be "lowercase-words". The values are:
The layout/inList item has the
same values, but a different use. It signals that if the items are
put into a list (such as a menu on a computer), then they should
be mechanically changed. For example, suppose that names of
languages are normally lowercase, but when put into a menu they
should normally have the first letter of the first word
capitalized. If that's true, then you should set this value to
If that value is wrong for any individual item, then you can override that particular item by adding an "alt" value. To do so, contact your administrator. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/delimiters/.* |
Delimiters
Change this field if your language uses different quotation marks. The alternate forms are for embedded quotations, such as "He said 'Stop!'". |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/dateRangePattern.* |
Ranges of Dates
Modify this field to control how a range of dates appears, eg "Oct 12 - Nov 9". |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/timeZoneNames/fallbackFormat |
Country-Based Time Zone City Pattern
Modify this field to control the formatting of country-based time zone display when a country has multiple time zones, and the city is used to disambiguate them. In the pattern, {0} will be replaced by the city and {1} will be the country. This is normally not changed, except perhaps in languages that don't use spaces. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/timeZoneNames/gmtFormat |
GMT Pattern
Modify this field if the format for GMT time uses different letters, such as HUA+0200 for GMT+02:00, or if the letters GMT occur after the time. Make sure you include the {0}; that is where the actual time value will go! |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/timeZoneNames/hourFormat |
GMT Hours Pattern
This field controls the format for the time used with the GMT Pattern. It contains two patterns separated by a ";". The first controls positive time values (and zero), and the second controls the negative values. So to get GMT+02.00 for positive values, and GMT-02.00 for negative values, you'd use +HH.mm;-HH.mm. |
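A minimal Python sketch of how the two halves of the hours pattern might be selected and then placed into the GMT Pattern. The function name is hypothetical, the default "GMT{0}" for the GMT Pattern is an assumption, and only the HH and mm fields are handled.

```python
def format_gmt(offset_minutes, hour_format="+HH.mm;-HH.mm", gmt_format="GMT{0}"):
    """Format a UTC offset given in minutes, e.g. 120 for two hours ahead."""
    # The first sub-pattern is for positive values and zero,
    # the second for negative values.
    positive, negative = hour_format.split(";")
    pattern = positive if offset_minutes >= 0 else negative
    hours, minutes = divmod(abs(offset_minutes), 60)
    time_text = (pattern.replace("HH", f"{hours:02d}")
                        .replace("mm", f"{minutes:02d}"))
    # The time value goes where {0} appears in the GMT Pattern.
    return gmt_format.replace("{0}", time_text)


print(format_gmt(120))    # GMT+02.00
print(format_gmt(-120))   # GMT-02.00
print(format_gmt(120, hour_format="+HHmm;-HHmm", gmt_format="HUA{0}"))  # HUA+0200
```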
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/timeZoneNames/regionFormat |
Country-Based Time Zone Pattern
For generic references to time zones, the country is used if possible, composed with a pattern that in English appears as "{0} Time". Thus a time zone may appear as "Malaysia Time" or "Hora de Malasia". If the country has multiple time zones, then a city is used to distinguish which one, thus "Argentina (La Rioja) Time". Some languages would normally have grammatical adjustments depending on what the name of the country or city is. For example, one might need "12:43 pm Tempo d'Australia" but "12:43 pm Tempo de Paris". In that case, there are two approaches:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/.*/days/.* |
Days of the Week
This field is one of the days of the week, such as Sunday or Monday. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
.*/timeZoneNames.* |
Time Zones
In the standard used for time zones, a time zone is an area
of a country that has consistent behavior in terms of its offset
from Greenwich Mean Time. In particular, within that zone, the
same daylight-savings (summer-time) behavior is observed, now and
in the past and future (as far as is known). This means that time
zones are fairly fine-grained, as you can see by consulting Territory
Containment.
The name of the time zone is taken from the most populous city,
such as
Time zones can be displayed in a variety of ways, depending on the environment and program requirements. Here are some examples:
These are composed from different pieces that you translate.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/dates/.*/months/.* |
Months of the Year
This field is one of the months of the year, such as January or February. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/fallback |
Locale Fallbacks
You should add here a list of locales that would be most natural to use when no translation is available (this is called a fallback). This is especially useful for minority languages. For example, for Breton [br] the most natural language to fall back to might be French [fr], that is, to use French names for countries that aren't translated. Similarly, the fallback for Moldavian [mo] might be Romanian [ro]. Fallbacks should only be included if a substantial majority of people speaking the language in question would be likely to understand the fallback language. If there are no such languages, the fallback field should be left blank.
Fallbacks can take the script or region into account; the fallback
for Northern Sámi (Finland) [se-FI] might be Finnish (Finland)
[fi-FI], while the fallback for Northern Sámi [se] generally might
be Norwegian [no].
The values you need to use are locale codes, not the names or translations;
thus you would put in
Multiple fallback languages can be entered in order of priority, separated by spaces, for example: nl en. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/(units/unit|numbers/currencies/currency.*/displayName).* |
Localized Units
Localized units provide more natural ways of expressing unit phrases that vary in plural form, such as "1 hour" vs "2 hours". While they cannot express all the intricacies of natural languages, they allow for more natural phrasing than constructions like "1 hour(s)". Please review the draft rules that CLDR is using for plurals for your language, at Language Plural Rules, and the description there about the plural categories. Each unit may have multiple plural forms, one for each category. These are composed with numbers using a unitPattern of the form "{0} {1}". A formatted number will be substituted in place of the "{0}", while the unit value will be substituted in place of the "{1}". For example, for English if the unit is an hour and the number is 1234, then the number is looked up to get the rule category other. The number is then formatted into "1,234" and composed with the unitName for other and the unitPattern for other to get the final result. Examples are in the table below.
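The composition steps just described can be sketched in Python as follows. The data tables and function names are hypothetical, and the English plural rule is simplified to "exactly 1 is one, everything else is other"; the actual rules are those listed under Language Plural Rules.

```python
# Hypothetical English data, for illustration only.
UNIT_NAMES = {"hour": {"one": "hour", "other": "hours"}}
UNIT_PATTERNS = {"one": "{0} {1}", "other": "{0} {1}"}


def plural_category_en(number):
    # Simplified English rule (an assumption): 1 is "one", all else "other".
    return "one" if number == 1 else "other"


def format_unit(number, unit):
    category = plural_category_en(number)   # step 1: look up the category
    formatted_number = f"{number:,}"        # step 2: format the number
    pattern = UNIT_PATTERNS[category]       # step 3: pick the unitPattern
    unit_name = UNIT_NAMES[unit][category]  #         and the unit name
    return pattern.replace("{0}", formatted_number).replace("{1}", unit_name)


print(format_unit(1234, "hour"))  # 1,234 hours
print(format_unit(1, "hour"))     # 1 hour
```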
There is one "default" unitPattern for each plural category, listed under the unit "one". If the particular unit needs a special unitPattern for a particular plural category, then one can also be added. That is, suppose that for a particular language, in the plural the number goes after the translation of hour instead of before. Then for the unit hour, and plural category other, the unitPattern can be different if needed. The key point is that if the examples look ok, you shouldn't need to do anything. To request a change in the plural rules, please file a bug report. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/localeDisplayNames/localeDisplayPattern/localePattern.* |
Locale Display Patterns
Locale display patterns are used to format a compound language (locale) name such as 'en_AU' or 'uz_Arab'. The pattern is something like "{0} ({1})". When the locale is formatted, the language is substituted for {0}, and the region or script for {1}. For example, take "en_AU". First the language code 'en' is translated, such as to "anglais", then the country is translated, such as "Australie". The pattern is used to put those together, into something like "anglais (Australie)". This works the same way if there is a script; for example, "uz-Arab" => "ouzbek (arabe)". If there is both a script and a region, then a list is formed using the separator, then {1} is replaced by that list, such as "uz-Arab-AF" => "ouzbek (arabe, Afghanistan)". For certain compound language (locale) names, you can also supply specific translations. Thus for the whole locale 'en_AU', you can provide a translation like "Australian English". |
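The composition above can be sketched in Python. The lookup tables and the function name are hypothetical, the separator is assumed to be ", ", and any subtag that is not a known script is treated as a region.

```python
# Hypothetical French translations, for illustration only.
LANGUAGES = {"en": "anglais", "uz": "ouzbek"}
SCRIPTS = {"Arab": "arabe"}
REGIONS = {"AU": "Australie", "AF": "Afghanistan"}


def display_name(locale, pattern="{0} ({1})", separator=", "):
    """Compose a display name for a code like 'en_AU' or 'uz-Arab-AF'."""
    parts = locale.replace("-", "_").split("_")
    language = LANGUAGES[parts[0]]
    # Translate each remaining subtag as a script or a region.
    extras = [SCRIPTS[p] if p in SCRIPTS else REGIONS[p] for p in parts[1:]]
    if not extras:
        return language
    # The language goes in {0}; the script/region list goes in {1}.
    return pattern.replace("{0}", language).replace("{1}", separator.join(extras))


print(display_name("en_AU"))       # anglais (Australie)
print(display_name("uz-Arab"))     # ouzbek (arabe)
print(display_name("uz-Arab-AF"))  # ouzbek (arabe, Afghanistan)
```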
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
//ldml/localeDisplayNames/codePatterns/codePattern.* |
Code Patterns
Code patterns are used in lists where the name of the language, script, or region is not available -- the code (like "de" for German) will be substituted for the {0} placeholder. Thus if the language code 'zaz' is not translated in your language, you might see in a list something like:
|
The text to insert can be fairly arbitrary HTML. The software that
reads this table will search the first column (eg between <td>
and </td>) and return the contents of the second column. We plan
on adding a few variables also, for the current locale name, in
particular. This file uses the survey tool style-sheet, so you can use
those styles (and icons, like
) in the text to insert.
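As a rough sketch of the lookup described above, the following Python matches an XML path against the first-column regular expressions and returns the matching help text. The row contents are abbreviated placeholders, the function name is hypothetical, and two points are assumptions rather than documented behavior: that a pattern must match the whole path, and that every matching row is returned rather than only the first.

```python
import re

# (Path regular expression, Text to Insert) pairs; texts are placeholders.
HELP_ROWS = [
    (r"//ldml/localeDisplayNames/territories.*", "Territories ..."),
    (r".*/currencies.*", "Currencies ..."),
    (r"//ldml/dates/calendars/calendar.*/(a|p)m", "AM and PM ..."),
]


def help_texts(xml_path, rows=HELP_ROWS):
    """Return the help text of every row whose pattern matches the path."""
    return [text for pattern, text in rows if re.fullmatch(pattern, xml_path)]


print(help_texts('//ldml/dates/calendars/calendar[@type="gregorian"]/am'))
# ['AM and PM ...']
```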
WARNING