Commit 229cd39

DOCSP-30552: Collations guide (#82)
(cherry picked from commit 47819c4)
1 parent 5655c96 commit 229cd39

4 files changed

+456
-2
lines changed

source/fundamentals.txt

Lines changed: 1 addition & 1 deletion
@@ -24,11 +24,11 @@ Fundamentals
2424
/fundamentals/performance
2525
/fundamentals/runtimes
2626
/fundamentals/monitoring
27+
/fundamentals/collations
2728

2829
..
2930
Connect to MongoDB Atlas from AWS Lambda <https://www.mongodb.com/docs/atlas/manage-connections-aws-lambda/>
3031
/fundamentals/transactions
31-
/fundamentals/collations
3232
/fundamentals/gridfs
3333
/fundamentals/encrypt-fields
3434
/fundamentals/geo

source/fundamentals/collations.txt

Lines changed: 363 additions & 0 deletions
@@ -0,0 +1,363 @@
1+
.. _rust-collations:
2+
3+
==========
4+
Collations
5+
==========
6+
7+
8+
.. facet::
9+
:name: genre
10+
:values: reference
11+
12+
.. meta::
13+
:keywords: string ordering, code example, french language
14+
15+
.. contents:: On this page
16+
:local:
17+
:backlinks: none
18+
:depth: 2
19+
:class: singlecol
20+
21+
Overview
22+
--------
23+
24+
In this guide, you can learn how to use **collations** to order your find or
25+
aggregation operation results by string values. A collation is a set of character
26+
ordering conventions that correspond to a specific language and locale.
27+
28+
This guide includes the following sections:
29+
30+
- :ref:`rust-mongodb-collations` describes how MongoDB sorts string values according
31+
to the default collation and custom collations
32+
- :ref:`rust-specify-collation` describes how to create a ``Collation`` struct instance
33+
- :ref:`rust-collection-collation` describes how to set the collation for a
34+
new collection
35+
- :ref:`rust-index-collation` describes how to set the collation for an index
36+
- :ref:`rust-op-collation` describes how to apply a collation to certain CRUD operations
37+
- :ref:`rust-collation-addtl-info` provides links to resources and API documentation
38+
for types and methods mentioned in this guide
39+
40+
.. _rust-mongodb-collations:
41+
42+
MongoDB Collations
43+
------------------
44+
45+
MongoDB sorts strings using *binary collation* by default. This collation method
46+
uses the ASCII standard character values to compare and order strings. Certain
47+
languages and locales have specific character ordering conventions that differ from
48+
the ASCII standard.
49+
50+
.. tip::
51+
52+
To learn more about the ASCII standard, see the :wikipedia:`ASCII
53+
<w/index.php?title=ASCII&oldid=1180396712>` Wikipedia page.
54+
55+
For example, in Canadian French, the right-most accented character determines
56+
the ordering for strings when the other characters are the same. Consider the
57+
following Canadian French words:
58+
59+
- cote
60+
- coté
61+
- côte
62+
- côté
63+
64+
When using the default binary collation, MongoDB sorts the words in the following order:
65+
66+
.. code-block:: none
67+
:copyable: false
68+
69+
cote
70+
coté
71+
côte
72+
côté
73+
74+
In this sort order, "coté" is placed before "côte" because binary comparison positions the
75+
ASCII character "o" before the non-ASCII character "ô".
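
The binary ordering above can be reproduced outside of MongoDB. The following is a minimal sketch in plain Rust, not driver code: it assumes that comparing the UTF-8 bytes of each string is a fair stand-in for binary collation, and ``binary_sort`` is a hypothetical helper name.

```rust
/// Sorts words by comparing their UTF-8 bytes, which matches
/// Unicode code point order. Hypothetical helper for illustration.
fn binary_sort(words: &[&str]) -> Vec<String> {
    let mut sorted: Vec<String> = words.iter().map(|w| w.to_string()).collect();
    // `String` implements `Ord` as a bytewise comparison.
    sorted.sort();
    sorted
}

fn main() {
    let words = ["côté", "côte", "coté", "cote"];
    for word in binary_sort(&words) {
        println!("{word}");
    }
}
```

In this sketch the second character decides the position of "coté" and "côte", so any word with a plain "o" sorts ahead of any word with "ô".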
76+
77+
When using the Canadian French collation, MongoDB sorts the words in the following order:
78+
79+
.. code-block:: none
80+
:copyable: false
81+
82+
cote
83+
côte
84+
coté
85+
côté
86+
87+
In this sort order, "coté" is placed after "côte" because Canadian French collation rules compare
88+
accents from right to left, and the final character "e" sorts before the final character "é".
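
The right-to-left accent rule can be sketched in plain Rust. This is an illustration under simplifying assumptions, not the ICU algorithm that the server uses: ``decompose``, ``compare_backwards``, and ``sort_backwards`` are hypothetical helpers, and ``decompose`` only recognizes the two accented letters in this example.

```rust
use std::cmp::Ordering;

/// Splits a character into a base letter and an accent weight.
/// Hypothetical helper: it only knows the accented letters used here.
fn decompose(c: char) -> (char, u8) {
    match c {
        'é' => ('e', 1),
        'ô' => ('o', 1),
        other => (other, 0),
    }
}

/// Compares base letters left to right, then breaks ties by comparing
/// accent weights from right to left.
fn compare_backwards(a: &str, b: &str) -> Ordering {
    let (a_base, a_accents): (Vec<char>, Vec<u8>) = a.chars().map(decompose).unzip();
    let (b_base, b_accents): (Vec<char>, Vec<u8>) = b.chars().map(decompose).unzip();
    a_base
        .cmp(&b_base)
        .then_with(|| a_accents.iter().rev().cmp(b_accents.iter().rev()))
}

/// Returns the words sorted with the comparison above.
fn sort_backwards(words: &[&str]) -> Vec<String> {
    let mut sorted: Vec<String> = words.iter().map(|w| w.to_string()).collect();
    sorted.sort_by(|a, b| compare_backwards(a, b));
    sorted
}

fn main() {
    for word in sort_backwards(&["côté", "coté", "côte", "cote"]) {
        println!("{word}");
    }
}
```

All four words share the base letters "cote", so only the accent pass decides the order. Reading accents from the right, "côte" has no accent on its last letter and therefore sorts before "coté".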
89+
90+
.. _rust-specify-collation:
91+
92+
Specify a Collation
93+
-------------------
94+
95+
You can define a collation by specifying a collation locale and other options in a ``Collation``
96+
struct instance. To begin building a ``Collation`` instance, call the ``Collation::builder()``
97+
method.
98+
99+
.. note:: Instantiating Options
100+
101+
The {+driver-short+} implements the Builder design pattern for the
102+
creation of many different types, including ``Collation``. You can
103+
use the ``builder()`` method to construct an instance of each type
104+
by chaining option builder methods.
105+
106+
The following table describes the builder methods that you can use to set fields of a ``Collation``
107+
instance. You must use the ``locale()`` method to build a valid ``Collation`` struct, but all other
108+
builder methods are optional:
109+
110+
.. list-table::
111+
:widths: 1 1 2
112+
:stub-columns: 1
113+
:header-rows: 1
114+
115+
* - Method
116+
- Possible Values
117+
- Description
118+
119+
* - | ``locale()`` *(Required)*
120+
- | For a full list of supported locales, see
121+
| :manual:`Supported Languages and Locales </reference/collation-locales-defaults/#supported-languages-and-locales>`
122+
| in the Server manual.
123+
- | Specifies the ICU locale
124+
125+
* - ``strength()``
126+
- ``CollationStrength::Primary``,
127+
``CollationStrength::Secondary``,
128+
``CollationStrength::Tertiary``,
129+
``CollationStrength::Quaternary``,
130+
``CollationStrength::Identical``
131+
- Specifies the level of comparison to perform
132+
133+
* - | ``case_level()``
134+
- | ``true``, ``false``
135+
- | Specifies whether the driver performs case comparison
136+
137+
* - ``case_first()``
138+
- ``CollationCaseFirst::Upper``,
139+
``CollationCaseFirst::Lower``,
140+
``CollationCaseFirst::Off``
141+
- Specifies the sort order of case differences during tertiary level comparisons
142+
143+
* - | ``numeric_ordering()``
144+
- | ``true``, ``false``
145+
- | Specifies whether the driver compares numeric strings as numbers
146+
147+
* - ``alternate()``
148+
- ``CollationAlternate::NonIgnorable``,
149+
``CollationAlternate::Shifted``
150+
- Specifies whether the driver considers whitespace and punctuation as base characters
151+
during string comparison
152+
153+
* - | ``max_variable()``
154+
- | ``CollationMaxVariable::Punct``,
155+
| ``CollationMaxVariable::Space``
156+
- | Specifies which characters the driver ignores when ``alternate`` is set to
157+
| ``CollationAlternate::Shifted``
158+
159+
* - ``normalization()``
160+
- ``true``, ``false``
161+
- Specifies whether the driver performs text normalization for string values
162+
163+
* - | ``backwards()``
164+
- | ``true``, ``false``
165+
- | Specifies whether the driver sorts strings containing diacritics in reverse character order
166+
167+
Example
168+
~~~~~~~
169+
170+
The following example specifies a ``Collation`` instance and sets the collation locale to ``"en_US"``:
171+
172+
.. literalinclude:: /includes/fundamentals/code-snippets/collation.rs
173+
:language: rust
174+
:dedent:
175+
:start-after: start-collation
176+
:end-before: end-collation
177+
178+
.. _rust-collection-collation:
179+
180+
Set a Collation on a Collection
181+
-------------------------------
182+
183+
When you create a new collection, you can define the collation for future operations
184+
called on that collection. Set the collation by using the ``collation()`` function
185+
when creating a ``CreateCollectionOptions`` instance. Then, call the ``create_collection()``
186+
method with your options instance as a parameter.
187+
188+
.. _rust-create-collection:
189+
190+
Create Collection with a Collation Example
191+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
192+
193+
This example specifies a collation according to the ``"fr"``, or French, locale conventions
194+
and applies the collation to a new collection called ``books``. The ``strength`` field is set to
195+
``CollationStrength::Primary`` to ignore differences in diacritics.
196+
197+
.. literalinclude:: /includes/fundamentals/code-snippets/collation.rs
198+
:language: rust
199+
:dedent:
200+
:start-after: start-create-collection
201+
:end-before: end-create-collection
202+
203+
Collation Ordering Demonstration
204+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
205+
206+
If you run an operation that supports collations on the ``books`` collection, the operation
207+
uses the collation specified in the preceding :ref:`rust-create-collection`.
208+
209+
Assume the ``books`` collection contains the following documents:
210+
211+
.. code-block:: json
212+
213+
{ "name" : "Emma", "length" : "474" }
214+
{ "name" : "Les Misérables", "length": "1462" }
215+
{ "name" : "Infinite Jest", "length" : "1104" }
216+
{ "name" : "Cryptonomicon", "length" : "918" }
217+
{ "name" : "Ça", "length" : "1138" }
218+
219+
.. tip::
220+
221+
To learn how to insert documents into a collection, see the :ref:`rust-insert-guide`
222+
guide.
223+
224+
The following example uses the ``find()`` method to return all documents in which the value
225+
of the ``name`` field alphabetically precedes ``"Infinite Jest"``:
226+
227+
.. io-code-block::
228+
:copyable: true
229+
230+
.. input:: /includes/fundamentals/code-snippets/collation.rs
231+
:language: rust
232+
:dedent:
233+
:start-after: start-default-query
234+
:end-before: end-default-query
235+
236+
.. output::
237+
:language: none
238+
:visible: false
239+
240+
{ "name": "Emma", "length": "474" }
241+
{ "name": "Cryptonomicon", "length": "918" }
242+
{ "name" : "Ça", "length" : "1138" }
243+
244+
245+
If you don't specify a collation for the ``books`` collection, the ``find()`` method follows
246+
default binary collation rules to determine the ``name`` values that precede ``"Infinite Jest"``.
247+
These rules place words beginning with "Ç" after those beginning with "I". So, when the
248+
preceding find operation follows binary collation rules, the document in which the ``name`` value is
249+
``"Ça"`` does not match the filter criteria.
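
To see why the collation changes which documents match, consider the following sketch in plain Rust. It is not driver code and only approximates a primary-strength comparison: ``primary_key`` and ``names_before`` are hypothetical helpers, and the folding table covers only the letters in the sample data.

```rust
/// Reduces a string to lowercase base letters, roughly imitating a
/// primary-strength comparison. Hypothetical helper: it only handles
/// the accented letters that appear in the sample data.
fn primary_key(s: &str) -> String {
    s.chars()
        .map(|c| match c {
            'Ç' | 'ç' => 'c',
            'É' | 'é' => 'e',
            other => other.to_ascii_lowercase(),
        })
        .collect()
}

/// Returns the names that sort before `bound`, using either folded keys
/// (`fold == true`) or a plain bytewise comparison (`fold == false`).
fn names_before<'a>(names: &[&'a str], bound: &str, fold: bool) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|name| {
            if fold {
                primary_key(name) < primary_key(bound)
            } else {
                *name < bound
            }
        })
        .collect()
}

fn main() {
    let names = ["Emma", "Les Misérables", "Infinite Jest", "Cryptonomicon", "Ça"];
    println!("binary: {:?}", names_before(&names, "Infinite Jest", false));
    println!("folded: {:?}", names_before(&names, "Infinite Jest", true));
}
```

With the bytewise comparison, "Ça" is excluded because the encoding of "Ç" is greater than that of "I". With folded keys, "Ça" is compared as "ca" and is included.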
250+
251+
.. _rust-index-collation:
252+
253+
Set a Collation on an Index
254+
---------------------------
255+
256+
When you create a new index on a collection, you can define the collation for operations
257+
that are covered by the index. To run an operation that uses the index and its collation, your
258+
operation and index must specify the same collation.
259+
260+
.. tip::
261+
262+
To learn more about indexes and covered queries, see the :ref:`rust-indexes` guide.
263+
264+
Set the index collation by using the ``collation()`` function to build an ``IndexOptions`` instance.
265+
Then, pass your ``IndexOptions`` as an argument to an ``IndexModel`` builder function, and pass your
266+
``IndexModel`` as an argument to the ``create_index()`` method.
267+
268+
Example
269+
~~~~~~~
270+
271+
The following example uses the ``create_index()`` method to create an ascending index on the
272+
``name`` field and specifies a new collation corresponding to the ``"en_US"`` locale:
273+
274+
.. io-code-block::
275+
:copyable: true
276+
277+
.. input:: /includes/fundamentals/code-snippets/collation.rs
278+
:language: rust
279+
:dedent:
280+
:start-after: start-index
281+
:end-before: end-index
282+
283+
.. output::
284+
:language: none
285+
:visible: false
286+
287+
Created index: name_1
288+
289+
.. _rust-op-collation:
290+
291+
Set a Collation on an Operation
292+
-------------------------------
293+
294+
Operations that read, update, and delete documents from a collection can use collations.
295+
Applying a collation to an operation overrides any collation previously defined for a
296+
collection or index.
297+
298+
If you apply a collation to an operation and that collation differs from an index's collation,
299+
the operation cannot use that index. As a result, the operation may not perform as efficiently as one that
300+
is covered by an index. For more information on the disadvantages of sorting operations
301+
not covered by an index, see :manual:`Using Indexes to Sort Query Results </tutorial/sort-results-with-indexes/>`
302+
in the Server manual.
303+
304+
Example
305+
~~~~~~~
306+
307+
This example performs the following actions:
308+
309+
- Sets the ``numeric_ordering`` collation option to ``true``, which ensures that values are sorted in
310+
numerical order rather than alphabetical order
311+
- Specifies a collation in a ``FindOptions`` instance, which overrides the collection's collation
312+
- Uses the ``find()`` method to return documents in which the value of the ``length`` field is greater
313+
than ``"1000"``
314+
315+
.. io-code-block::
316+
:copyable: true
317+
318+
.. input:: /includes/fundamentals/code-snippets/collation.rs
319+
:language: rust
320+
:dedent:
321+
:start-after: start-op-collation
322+
:end-before: end-op-collation
323+
324+
.. output::
325+
:language: none
326+
:visible: false
327+
328+
{ "name" : "Les Misérables", "length": "1462" }
329+
{ "name" : "Infinite Jest", "length" : "1104" }
330+
{ "name" : "Ça", "length" : "1138" }
331+
332+
If you run the preceding find operation without setting the ``numeric_ordering`` option to ``true``,
333+
the driver compares ``length`` values as strings and orders the string value ``"1000"`` before the
334+
values ``"474"`` and ``"918"``. In this case, the preceding find operation returns all documents in
335+
the ``books`` collection.
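
The difference between the two comparisons can be sketched in plain Rust. This is an illustration only, not driver code: ``longer_than`` is a hypothetical helper, and it assumes each value is a whole non-negative number, whereas real numeric ordering also handles digits embedded in longer strings.

```rust
/// Returns the values greater than `bound`, compared either as numbers
/// (`numeric == true`) or as plain strings (`numeric == false`).
/// Hypothetical helper for illustration.
fn longer_than<'a>(lengths: &[&'a str], bound: &str, numeric: bool) -> Vec<&'a str> {
    lengths
        .iter()
        .copied()
        .filter(|len| {
            if numeric {
                // Parse both sides so that 474 is less than 1000.
                match (len.parse::<u64>(), bound.parse::<u64>()) {
                    (Ok(l), Ok(b)) => l > b,
                    _ => false,
                }
            } else {
                // Bytewise comparison: "474" > "1000" because '4' > '1'.
                *len > bound
            }
        })
        .collect()
}

fn main() {
    let lengths = ["474", "1462", "1104", "918", "1138"];
    println!("numeric: {:?}", longer_than(&lengths, "1000", true));
    println!("string:  {:?}", longer_than(&lengths, "1000", false));
}
```

The string comparison stops at the first differing character, so every sample value counts as greater than ``"1000"``. The numeric comparison keeps only the three values above 1000.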
336+
337+
.. _rust-collation-addtl-info:
338+
339+
Additional Information
340+
----------------------
341+
342+
To learn more about the ``find()`` method, see the :ref:`rust-retrieve-guide` guide.
343+
344+
To learn more about collations, see the following Server manual pages:
345+
346+
- :manual:`Collation </reference/collation/>`
347+
- :manual:`Collation Locales and Default Parameters </reference/collation-locales-defaults/>`
348+
349+
API Documentation
350+
~~~~~~~~~~~~~~~~~
351+
352+
To learn more about any of the methods or types mentioned in this guide, see the
353+
following API documentation:
354+
355+
- `Collation <{+api+}/options/struct.Collation.html>`__
356+
- `create_collection() <{+api+}/struct.Database.html#method.create_collection>`__
357+
- `CreateCollectionOptions <{+api+}/options/struct.CreateCollectionOptions.html>`__
358+
- `create_index() <{+api+}/struct.Collection.html#method.create_index>`__
359+
- `IndexOptions <{+api+}/options/struct.IndexOptions.html>`__
360+
- `IndexModel <{+api+}/struct.IndexModel.html>`__
361+
- `find() <{+api+}/struct.Collection.html#method.find>`__
362+
- `FindOptions <{+api+}/options/struct.FindOptions.html>`__
363+

source/includes/fundamentals-sections.rst

Lines changed: 1 addition & 1 deletion
@@ -14,13 +14,13 @@ Fundamentals section:
1414
- :ref:`Create a Time Series Collection <rust-time-series>`
1515
- :ref:`Record Driver Events <rust-tracing-logging>`
1616
- :ref:`Run A Database Command <rust-run-command>`
17+
- :ref:`Specify Collations to Order Results <rust-collations>`
1718
- :ref:`Optimize Driver Performance <rust-performance>`
1819
- :ref:`Configure Asynchronous and Synchronous Runtimes <rust-runtimes>`
1920
- :ref:`Monitor Driver Events <rust-monitoring>`
2021

2122
..
2223
- :atlas:`Connect to MongoDB Atlas from AWS Lambda </manage-connections-aws-lambda/>`
23-
- :ref:`Specify Collations to Order Results <rust-collations>`
2424
- :ref:`Store and Retrieve Large Files by Using GridFS <rust-gridfs>`
2525
- :ref:`Encrypt Fields <rust-fle>`
2626
- :ref:`Query and Write Geospatial Data <rust-geo>`
