$regexFind (aggregation)

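Definition

$regexFind: Provides regular expression (regex) pattern matching capability in aggregation expressions. If a match is found, returns a document that contains information on the first match. If a match is not found, returns null.

Syntax

The $regexFind operator has the following syntax: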

{ $regexFind: { input: <expression> , regex: <expression>, options: <expression> } }

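Operator Fields

Field

Description

input

The string on which you wish to apply the regex pattern. Can be a string or any valid expression that resolves to a string.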

regex

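The regex pattern to apply. Can be any valid expression that resolves to either a string or a regex pattern /<pattern>/. When using the regex /<pattern>/, you can also specify the regex options i and m (but not the s or x options):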

"pattern"
/<pattern>/
/<pattern>/<options>

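Alternatively, you can specify the regex options with the options field. To specify the s or x options, you must use the options field.

You cannot specify options in both the regex and the options field.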

options

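Optional. The following <options> are available for use with regular expressions.

You cannot specify options in both the regex and the options field.

Option	Description
`i`	Case insensitivity to match both upper and lower case letters. You can specify this option in the `options` field or as part of the regex field.
`m`	For patterns that include anchors (i.e. `^` for the start, `$` for the end), match at the beginning or end of each line for strings with multiline values. Without this option, these anchors match at the beginning or end of the string. If the pattern contains no anchors or if the string value has no newline characters (e.g. `\n`), the `m` option has no effect.
`x`	"Extended" capability to ignore all white space characters in the pattern unless escaped or included in a character class. Additionally, it ignores all characters between an unescaped hash (`#`) character and the next new line, inclusive, so that you can include comments in complicated patterns. This only applies to data characters; white space characters may never appear within special character sequences in a pattern. The `x` option does not affect the handling of the VT character (i.e. code 11). You can specify this option only in the `options` field.
`s`	Allows the dot character (i.e. `.`) to match all characters, including newline characters. You can specify this option only in the `options` field.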
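
The effect of the i, m, and s options can be previewed outside MongoDB. The following is a minimal sketch that relies on JavaScript's built-in RegExp flags of the same names, on the assumption that they behave like the corresponding $regexFind options for these simple patterns. JavaScript has no x flag, so that option is not shown, and the sample string is only illustrative.

```javascript
// Assumption: the JavaScript RegExp flags i, m, and s mirror the same-named
// $regexFind options for these simple patterns (MongoDB itself uses PCRE).
const text = "First lines\nsecond line";

// Without m, ^ anchors only at the start of the whole string.
const withoutM = /^s/.test(text);
// With m, ^ also anchors right after each newline.
const withM = /^s/m.test(text);

// Without s, the dot does not match a newline.
const withoutS = /First.*second/.test(text);
// With s, the dot matches any character, including a newline.
const withS = /First.*second/s.test(text);

// With i, letters match regardless of case.
const withI = /LINE/i.test(text);

console.log({ withoutM, withM, withoutS, withS, withI });
// { withoutM: false, withM: true, withoutS: false, withS: true, withI: true }
```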

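Returns

If the operator does not find a match, the result of the operator is null.

If the operator finds a match, the result of the operator is a document that contains:

the first matching string in the input,
the code point index (not byte index) of the matching string in the input, and
an array of the strings that correspond to the groups captured by the matching string. Capturing groups are specified with unescaped parentheses () in the regex pattern.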

{ "match" : <string>, "idx" : <num>, "captures" : <array of strings> }
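
To make the shape of the result concrete, here is a small sketch that builds the same three fields with JavaScript's RegExp engine. The helper name regexFindSketch is hypothetical and is not part of MongoDB; because MongoDB uses PCRE rather than the JavaScript engine, treat it as an approximation for simple patterns only.

```javascript
// Hypothetical helper: emulates the shape of a $regexFind result with
// JavaScript's RegExp engine (MongoDB itself uses PCRE, so edge cases differ).
function regexFindSketch(input, regex) {
  const m = regex.exec(input);
  if (m === null) return null; // no match -> null
  return {
    match: m[0],
    // Count code points (not UTF-16 units or bytes) that precede the match.
    idx: Array.from(input.slice(0, m.index)).length,
    // One element per capture group; groups that did not capture become null.
    captures: m.slice(1).map((g) => (g === undefined ? null : g)),
  };
}

console.log(regexFindSketch("First lines\nsecond line", /lin(e|k)/));
// { match: 'line', idx: 6, captures: [ 'e' ] }
console.log(regexFindSketch("Single LINE description.", /lin(e|k)/));
// null
```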

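Behavior

PCRE Library

Starting in version 6.1, MongoDB uses the PCRE2 (Perl Compatible Regular Expressions) library to implement regular expression pattern matching. To learn more about PCRE2, see the PCRE documentation.

`$regexFind` and Collation

$regexFind ignores the collation specified for the collection, for db.collection.aggregate(), and for the index, if one is used.

For example, create a sample collection with collation strength 1 (i.e. compare base characters only and ignore other differences such as case and diacritics):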

db.createCollection( "myColl", { collation: { locale: "fr", strength: 1 } } )

Insert the following documents:

db.myColl.insertMany([
   { _id: 1, category: "café" },
   { _id: 2, category: "cafe" },
   { _id: 3, category: "cafE" }
])

Using the collection's collation, the following operation performs a case-insensitive and diacritic-insensitive match:

db.myColl.aggregate( [ { $match: { category: "cafe" } } ] )

The operation returns the following 3 documents:

{ "_id" : 1, "category" : "café" }
{ "_id" : 2, "category" : "cafe" }
{ "_id" : 3, "category" : "cafE" }

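However, the aggregation expression $regexFind ignores collation; that is, the following regular expression pattern matching examples are case-sensitive and diacritic-sensitive: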

db.myColl.aggregate( [ { $addFields: { resultObject: { $regexFind: { input: "$category", regex: /cafe/ }  } } } ] )
db.myColl.aggregate(
   [ { $addFields: { resultObject: { $regexFind: { input: "$category", regex: /cafe/ }  } } } ],
   { collation: { locale: "fr", strength: 1 } }           // Ignored in the $regexFind
)

Both operations return the following:

{ "_id" : 1, "category" : "café", "resultObject" : null }
{ "_id" : 2, "category" : "cafe", "resultObject" : { "match" : "cafe", "idx" : 0, "captures" : [ ] } }
{ "_id" : 3, "category" : "cafE", "resultObject" : null }

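To perform case-insensitive regex pattern matching, use the i option instead. See the `i` Option section for an example.

`captures` Output Behavior

If your regex pattern contains capture groups and the pattern finds a match in the input, the captures array in the result corresponds to the groups captured by the matching string. Capture groups are specified with unescaped parentheses () in the regex pattern. The length of the captures array equals the number of capture groups in the pattern, and the order of the array matches the order in which the capture groups appear.

Create a sample collection named contacts with the following documents: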
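
The difference between collation-based equality and regex matching can be approximated outside the database. The sketch below assumes that JavaScript's Intl.Collator with sensitivity "base" is a reasonable stand-in for a strength 1 collation; it is not MongoDB's collation implementation and is shown only to illustrate why the two comparisons disagree.

```javascript
// Assumption: Intl.Collator with sensitivity "base" approximates a
// strength 1 collation (base characters only). It is not MongoDB's collation.
const categories = ["caf\u00e9", "cafe", "cafE"]; // "café", "cafe", "cafE"
const collator = new Intl.Collator("fr", { sensitivity: "base" });

// Collation-style equality: all three values compare equal to "cafe".
const equalUnderCollation = categories.filter(
  (c) => collator.compare(c, "cafe") === 0
);

// Regex matching: only the exact, unaccented, lowercase value matches.
const matchedByRegex = categories.filter((c) => /cafe/.test(c));

console.log(equalUnderCollation.length); // 3
console.log(matchedByRegex);             // [ 'cafe' ]
```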

db.contacts.insertMany([
  { "_id": 1, "fname": "Carol", "lname": "Smith", "phone": "718-555-0113" },
  { "_id": 2, "fname": "Daryl", "lname": "Doe", "phone": "212-555-8832" },
  { "_id": 3, "fname": "Polly", "lname": "Andrews", "phone": "208-555-1932" },
  { "_id": 4, "fname": "Colleen", "lname": "Duncan", "phone": "775-555-0187" },
  { "_id": 5, "fname": "Luna", "lname": "Clarke", "phone": "917-555-4414" }
])

The following pipeline applies the regex pattern /(C(ar)*)ol/ to the fname field:

db.contacts.aggregate([
  {
    $project: {
      returnObject: {
        $regexFind: { input: "$fname", regex: /(C(ar)*)ol/ }
      }
    }
  }
])

The regex pattern finds a match in the fname values Carol and Colleen:

{ "_id" : 1, "returnObject" : { "match" : "Carol", "idx" : 0, "captures" : [ "Car", "ar" ] } }
{ "_id" : 2, "returnObject" : null }
{ "_id" : 3, "returnObject" : null }
{ "_id" : 4, "returnObject" : { "match" : "Col", "idx" : 0, "captures" : [ "C", null ] } }
{ "_id" : 5, "returnObject" : null }

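The pattern contains the capture group (C(ar)*), which contains the nested group (ar). The elements in the captures array correspond to the two capture groups. If a group captures nothing in a matching document (for example, Colleen and the group (ar)), $regexFind replaces the group with a null placeholder.

As shown in the previous example, the captures array contains an element for each capture group (using null for groups that captured nothing). Consider the following example, which searches for phone numbers with New York City area codes by applying a logical or of capture groups to the phone field. Each group represents a New York City area code: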

db.contacts.aggregate([
  {
    $project: {
      nycContacts: {
        $regexFind: { input: "$phone", regex: /^(718).*|^(212).*|^(917).*/ }
      }
    }
  }
])

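For documents that the regex pattern matches, the captures array includes the matching capture group and replaces any group that did not capture with null: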

{ "_id" : 1, "nycContacts" : { "match" : "718-555-0113", "idx" : 0, "captures" : [ "718", null, null ] } }
{ "_id" : 2, "nycContacts" : { "match" : "212-555-8832", "idx" : 0, "captures" : [ null, "212", null ] } }
{ "_id" : 3, "nycContacts" : null }
{ "_id" : 4, "nycContacts" : null }
{ "_id" : 5, "nycContacts" : { "match" : "917-555-4414", "idx" : 0, "captures" : [ null, null, "917" ] } }
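
The null placeholders can be reproduced in plain JavaScript, where String.prototype.match reports groups that did not participate as undefined. The helper capturesSketch below is a hypothetical illustration of the captures array only, under the assumption that the JavaScript engine agrees with PCRE on this simple alternation.

```javascript
// Same pattern as in the pipeline above: one capture group per area code.
const nycPattern = /^(718).*|^(212).*|^(917).*/;

// Hypothetical helper: returns only the captures array, or null on no match.
function capturesSketch(input, regex) {
  const m = input.match(regex);
  // JavaScript uses undefined for groups that captured nothing; map to null.
  return m === null ? null : m.slice(1).map((g) => (g === undefined ? null : g));
}

console.log(capturesSketch("718-555-0113", nycPattern)); // [ '718', null, null ]
console.log(capturesSketch("212-555-8832", nycPattern)); // [ null, '212', null ]
console.log(capturesSketch("208-555-1932", nycPattern)); // null
```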

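Examples

`$regexFind` and Its Options

To illustrate the behavior of the $regexFind operator as discussed in this example, create a sample collection products with the following documents: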

db.products.insertMany([
   { _id: 1, description: "Single LINE description." },
   { _id: 2, description: "First lines\nsecond line" },
   { _id: 3, description: "Many spaces before     line" },
   { _id: 4, description: "Multiple\nline descriptions" },
   { _id: 5, description: "anchors, links and hyperlinks" },
   { _id: 6, description: "métier work vocation" }
])

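By default, $regexFind performs a case-sensitive match. For example, the following aggregation performs a case-sensitive $regexFind on the description field. The regex pattern /line/ does not specify any grouping: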

db.products.aggregate([
   { $addFields: { returnObject: { $regexFind: { input: "$description", regex: /line/ } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "line", "idx" : 6, "captures" : [ ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "line", "idx" : 23, "captures" : [ ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "line", "idx" : 9, "captures" : [ ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }

The following regex pattern /lin(e|k)/ specifies a grouping (e|k) in the pattern:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFind: { input: "$description", regex: /lin(e|k)/ } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "line", "idx" : 6, "captures" : [ "e" ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "line", "idx" : 23, "captures" : [ "e" ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "line", "idx" : 9, "captures" : [ "e" ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : { "match" : "link", "idx" : 9, "captures" : [ "k" ] } }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }

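In the returned document, the idx field is the code point index, not the byte index. To illustrate, consider the following example that uses the regex pattern /tier/: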

db.products.aggregate([
   { $addFields: { returnObject: { $regexFind: { input: "$description", regex: /tier/ } } } }
])

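The operation returns the following, where only the last record matches the pattern and the returned idx is 2 (it would be 3 if a byte index were used):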

{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : null }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : null }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : null }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation",
             "returnObject" : { "match" : "tier", "idx" : 2, "captures" : [ ] } }

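The two index values can be checked by hand. The following sketch counts code points and UTF-8 bytes in the text that precedes the match; it assumes a runtime that provides the standard TextEncoder global, such as a current Node.js or browser.

```javascript
const description = "m\u00e9tier work vocation"; // "métier work vocation"
const unitIndex = description.indexOf("tier");   // UTF-16 code unit index
const prefix = description.slice(0, unitIndex);  // text before the match

// Code point index: the kind of value $regexFind reports as idx.
const codePointIdx = Array.from(prefix).length;
// Byte index: length of the prefix encoded as UTF-8 ("é" takes 2 bytes).
const byteIdx = new TextEncoder().encode(prefix).length;

console.log(codePointIdx, byteIdx); // 2 3
```
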
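`i` Option

Note

You cannot specify options in both the regex and the options field.

To perform case-insensitive pattern matching, include the i option as part of the regex field or in the options field: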

// Specify i as part of the regex field
{ $regexFind: { input: "$description", regex: /line/i } }
// Specify i in the options field
{ $regexFind: { input: "$description", regex: /line/, options: "i" } }
{ $regexFind: { input: "$description", regex: "line", options: "i" } }

For example, the following aggregation performs a case-insensitive $regexFind on the description field. The regex pattern /line/ does not specify any grouping:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFind: { input: "$description", regex: /line/i } } } }
])

The operation returns the following documents:

{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : { "match" : "LINE", "idx" : 7, "captures" : [ ] } }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "line", "idx" : 6, "captures" : [ ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "line", "idx" : 23, "captures" : [ ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "line", "idx" : 9, "captures" : [ ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }

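`m` Option

Note

You cannot specify options in both the regex and the options field.

To match the specified anchors (e.g. ^, $) for each line of a multiline string, include the m option as part of the regex field or in the options field: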

// Specify m as part of the regex field
{ $regexFind: { input: "$description", regex: /line/m } }
// Specify m in the options field
{ $regexFind: { input: "$description", regex: /line/, options: "m" } }
{ $regexFind: { input: "$description", regex: "line", options: "m" } }

The following example includes both the i and the m options to match lines that start with the letter s or S in multiline strings:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFind: { input: "$description", regex: /^s/im } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : { "match" : "S", "idx" : 0, "captures" : [ ] } }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "s", "idx" : 12, "captures" : [ ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : null }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : null }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }

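`x` Option

Note

You cannot specify options in both the regex and the options field.

To ignore all unescaped white space characters and comments (denoted by an unescaped hash # character and the next newline character) in the pattern, include the x option in the options field: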

// Specify x in the options field
{ $regexFind: { input: "$description", regex: /line/, options: "x" } }
{ $regexFind: { input: "$description", regex: "line", options: "x" } }

The following example includes the x option to skip unescaped white space and comments:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFind: { input: "$description", regex: /lin(e|k) # matches line or link/, options:"x" } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : { "match" : "line", "idx" : 6, "captures" : [ "e" ] } }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "line", "idx" : 23, "captures" : [ "e" ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "line", "idx" : 9, "captures" : [ "e" ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : { "match" : "link", "idx" : 9, "captures" : [ "k" ] } }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }
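
JavaScript's RegExp has no x flag, so the effect of the option can be illustrated by removing the ignored parts of the pattern by hand. The helper stripExtended below is hypothetical and deliberately simplified: it assumes that dropping unescaped white space and # comments outside character classes is enough for simple patterns, and it does not cover every PCRE extended-mode rule.

```javascript
// Hypothetical helper: a simplified emulation of the x option. It drops
// unescaped white space and "#..." comments outside character classes.
function stripExtended(pattern) {
  let out = "";
  let inClass = false;
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    if (ch === "\\") {
      // Keep an escaped character (backslash plus next char) verbatim.
      out += ch + (i + 1 < pattern.length ? pattern[i + 1] : "");
      i++;
    } else if (inClass) {
      // Inside [...] white space and # are literal.
      if (ch === "]") inClass = false;
      out += ch;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
    } else if (ch === "#") {
      // A comment runs up to, and including, the next newline.
      while (i < pattern.length && pattern[i] !== "\n") i++;
    } else if (!/\s/.test(ch)) {
      out += ch; // anything except unescaped white space is kept
    }
  }
  return out;
}

const source = stripExtended("lin(e|k) # matches line or link");
console.log(source); // lin(e|k)
console.log(new RegExp(source).exec("anchors, links and hyperlinks").index); // 9
```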

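`s` Option

Note

You cannot specify options in both the regex and the options field.

To allow the dot character (i.e. .) in the pattern to match all characters, including the newline character, include the s option in the options field: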

// Specify s in the options field
{ $regexFind: { input: "$description", regex: /m.*line/, options: "s" } }
{ $regexFind: { input: "$description", regex: "m.*line", options: "s" } }

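The following example includes the s option to allow the dot character (i.e. .) to match all characters, including the newline character, as well as the i option to perform a case-insensitive match: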

db.products.aggregate([
   { $addFields: { returnObject: { $regexFind: { input: "$description", regex:/m.*line/, options: "si"  } } } }
])

The operation returns the following:

{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : null }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : null }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : { "match" : "Many spaces before     line", "idx" : 0, "captures" : [ ] } }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : { "match" : "Multiple\nline", "idx" : 0, "captures" : [ ] } }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : null }
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : null }

Use `$regexFind` to Parse Email from String

Create a sample collection feedback with the following documents:

db.feedback.insertMany([
   { "_id" : 1, comment: "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com"  },
   { "_id" : 2, comment: "I wanted to concatenate a string" },
   { "_id" : 3, comment: "I can't find how to convert a date to string. cam@mongodb.com" },
   { "_id" : 4, comment: "It's just me. I'm testing.  fred@MongoDB.com" }
])

The following aggregation uses $regexFind to extract the email address from the comment field (case insensitive):

db.feedback.aggregate( [
    { $addFields: {
       "email": { $regexFind: { input: "$comment", regex: /[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+/i } }
    } },
    { $set: { email: "$email.match"} }
] )

First Stage

The stage uses the $addFields stage to add a new field email to the document. The new field contains the result of performing the $regexFind on the comment field:

{ "_id" : 1, "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com", "email" : { "match" : "aunt.arc.tica@example.com", "idx" : 38, "captures" : [ ] } }
{ "_id" : 2, "comment" : "I wanted to concatenate a string", "email" : null }
{ "_id" : 3, "comment" : "I can't find how to convert a date to string. cam@mongodb.com", "email" : { "match" : "cam@mongodb.com", "idx" : 46, "captures" : [ ] } }
{ "_id" : 4, "comment" : "It's just me. I'm testing.  fred@MongoDB.com", "email" : { "match" : "fred@MongoDB.com", "idx" : 28, "captures" : [ ] } }

Second Stage

The stage uses the $set stage to reset email to the current "$email.match" value. If the current value of email is null, the new value of email is set to null.

{ "_id" : 1, "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com", "email" : "aunt.arc.tica@example.com" }
{ "_id" : 2, "comment" : "I wanted to concatenate a string" }
{ "_id" : 3, "comment" : "I can't find how to convert a date to string. cam@mongodb.com", "email" : "cam@mongodb.com" }
{ "_id" : 4, "comment" : "It's just me. I'm testing.  fred@MongoDB.com", "email" : "fred@MongoDB.com" }
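
The two stages can be mirrored in plain JavaScript to check the pattern before running the pipeline. This is only a sketch over an in-memory array: the variable names are hypothetical, the JavaScript RegExp engine stands in for PCRE, and the email field is simply omitted when nothing matches, as in the output shown above.

```javascript
// Same pattern as in the pipeline above.
const emailPattern = /[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+/i;

// In-memory stand-in for a few documents of the feedback collection.
const feedback = [
  { _id: 1, comment: "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com" },
  { _id: 2, comment: "I wanted to concatenate a string" },
  { _id: 4, comment: "It's just me. I'm testing.  fred@MongoDB.com" },
];

const withEmail = feedback.map((doc) => {
  // Stage 1 equivalent: find the first match in the comment.
  const m = emailPattern.exec(doc.comment);
  // Stage 2 equivalent: keep only the matched string; omit email on no match.
  return m === null ? { ...doc } : { ...doc, email: m[0] };
});

console.log(withEmail.map((doc) => doc.email));
// [ 'aunt.arc.tica@example.com', undefined, 'fred@MongoDB.com' ]
```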

Apply `$regexFind` to String Elements of an Array

Create a sample collection contacts with the following documents:

db.contacts.insertMany([
   { "_id" : 1, name: "Aunt Arc Tikka", details: [ "+672-19-9999", "aunt.arc.tica@example.com" ] },
   { "_id" : 2, name: "Belle Gium",  details: [ "+32-2-111-11-11", "belle.gium@example.com" ] },
   { "_id" : 3, name: "Cam Bo Dia",  details: [ "+855-012-000-0000", "cam.bo.dia@example.com" ] },
   { "_id" : 4, name: "Fred", details: [ "+1-111-222-3333" ] }
])

The following aggregation uses $regexFind to convert the details array into an embedded document with email and phone fields:

db.contacts.aggregate( [
   { $unwind: "$details" },
   { $addFields: {
      "regexemail": { $regexFind: { input: "$details", regex: /^[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+$/, options: "i" } },
      "regexphone": { $regexFind: { input: "$details", regex: /^[+]{0,1}[0-9]*\-?[0-9_\-]+$/ } }
   } },
   { $project: { _id: 1, name: 1, details: { email: "$regexemail.match", phone: "$regexphone.match" } } },
   { $group: { _id: "$_id", name: { $first: "$name" }, details: { $mergeObjects: "$details"} } },
   { $sort: { _id: 1 } }
])

First Stage

The stage uses $unwind to unwind the array into separate documents:

{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : "+672-19-9999" }
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : "aunt.arc.tica@example.com" }
{ "_id" : 2, "name" : "Belle Gium", "details" : "+32-2-111-11-11" }
{ "_id" : 2, "name" : "Belle Gium", "details" : "belle.gium@example.com" }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : "+855-012-000-0000" }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : "cam.bo.dia@example.com" }
{ "_id" : 4, "name" : "Fred", "details" : "+1-111-222-3333" }

Second Stage

The stage uses the $addFields stage to add new fields to the document that contain the result of the $regexFind for the phone number and the email:

{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : "+672-19-9999", "regexemail" : null, "regexphone" : { "match" : "+672-19-9999", "idx" : 0, "captures" : [ ] } }
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : "aunt.arc.tica@example.com", "regexemail" : { "match" : "aunt.arc.tica@example.com", "idx" : 0, "captures" : [ ] }, "regexphone" : null }
{ "_id" : 2, "name" : "Belle Gium", "details" : "+32-2-111-11-11", "regexemail" : null, "regexphone" : { "match" : "+32-2-111-11-11", "idx" : 0, "captures" : [ ] } }
{ "_id" : 2, "name" : "Belle Gium", "details" : "belle.gium@example.com", "regexemail" : { "match" : "belle.gium@example.com", "idx" : 0, "captures" : [ ] }, "regexphone" : null }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : "+855-012-000-0000", "regexemail" : null, "regexphone" : { "match" : "+855-012-000-0000", "idx" : 0, "captures" : [ ] } }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : "cam.bo.dia@example.com", "regexemail" : { "match" : "cam.bo.dia@example.com", "idx" : 0, "captures" : [ ] }, "regexphone" : null }
{ "_id" : 4, "name" : "Fred", "details" : "+1-111-222-3333", "regexemail" : null, "regexphone" : { "match" : "+1-111-222-3333", "idx" : 0, "captures" : [ ] } }

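Third Stage

The stage uses the $project stage to output documents with the _id field, the name field, and the details field. The details field is set to a document with email and phone fields, whose values are determined from the regexemail and regexphone fields, respectively.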

{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : { "phone" : "+672-19-9999" } }
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : { "email" : "aunt.arc.tica@example.com" } }
{ "_id" : 2, "name" : "Belle Gium", "details" : { "phone" : "+32-2-111-11-11" } }
{ "_id" : 2, "name" : "Belle Gium", "details" : { "email" : "belle.gium@example.com" } }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : { "phone" : "+855-012-000-0000" } }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : { "email" : "cam.bo.dia@example.com" } }
{ "_id" : 4, "name" : "Fred", "details" : { "phone" : "+1-111-222-3333" } }

Fourth Stage

The stage uses the $group stage to group the input documents by their _id value. The stage uses the $mergeObjects expression to merge the details documents.

{ "_id" : 3, "name" : "Cam Bo Dia", "details" : { "phone" : "+855-012-000-0000", "email" : "cam.bo.dia@example.com" } }
{ "_id" : 4, "name" : "Fred", "details" : { "phone" : "+1-111-222-3333" } }
{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : { "phone" : "+672-19-9999", "email" : "aunt.arc.tica@example.com" } }
{ "_id" : 2, "name" : "Belle Gium", "details" : { "phone" : "+32-2-111-11-11", "email" : "belle.gium@example.com" } }

Fifth Stage

The stage uses the $sort stage to sort the documents by the _id field.

{ "_id" : 1, "name" : "Aunt Arc Tikka", "details" : { "phone" : "+672-19-9999", "email" : "aunt.arc.tica@example.com" } }
{ "_id" : 2, "name" : "Belle Gium", "details" : { "phone" : "+32-2-111-11-11", "email" : "belle.gium@example.com" } }
{ "_id" : 3, "name" : "Cam Bo Dia", "details" : { "phone" : "+855-012-000-0000", "email" : "cam.bo.dia@example.com" } }
{ "_id" : 4, "name" : "Fred", "details" : { "phone" : "+1-111-222-3333" } }
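
The classification logic of this pipeline can be sketched in plain JavaScript for a single document. The helper toDetailsDoc is hypothetical: it loops over the array instead of unwinding and regrouping, and it uses the JavaScript RegExp engine, so it illustrates the idea rather than the pipeline's actual execution.

```javascript
// Same patterns as in the pipeline above (the email one is case-insensitive).
const emailRe = /^[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+$/i;
const phoneRe = /^[+]{0,1}[0-9]*\-?[0-9_\-]+$/;

// Hypothetical helper: turns the details array into an embedded document.
function toDetailsDoc(contact) {
  const details = {};
  for (const entry of contact.details) {
    // Both patterns are anchored, so a match is always the whole entry.
    if (phoneRe.test(entry)) details.phone = entry;
    if (emailRe.test(entry)) details.email = entry;
  }
  return { _id: contact._id, name: contact.name, details };
}

console.log(toDetailsDoc({ _id: 4, name: "Fred", details: ["+1-111-222-3333"] }));
// { _id: 4, name: 'Fred', details: { phone: '+1-111-222-3333' } }
```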

Use Captured Groupings to Parse User Name

Create a sample collection employees with the following documents:

db.employees.insertMany([
   { "_id" : 1, name: "Aunt Arc Tikka", "email" : "aunt.tica@example.com" },
   { "_id" : 2, name: "Belle Gium", "email" : "belle.gium@example.com" },
   { "_id" : 3, name: "Cam Bo Dia", "email" : "cam.dia@example.com" },
   { "_id" : 4, name: "Fred"  }
])

The employee email has the format <firstname>.<lastname>@example.com. Using the captures field returned in the $regexFind results, you can parse out the user names of employees:

db.employees.aggregate( [
    { $addFields: {
       "username": { $regexFind: { input: "$email", regex: /^([a-z0-9_.+-]+)@[a-z0-9_.+-]+\.[a-z0-9_.+-]+$/, options: "i" } },
    } },
    { $set: { username: { $arrayElemAt:  [ "$username.captures", 0 ] } } }
] )

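First Stage

The stage uses the $addFields stage to add a new field username to the document. The new field contains the result of performing the $regexFind on the email field: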

{ "_id" : 1, "name" : "Aunt Arc Tikka", "email" : "aunt.tica@example.com", "username" : { "match" : "aunt.tica@example.com", "idx" : 0, "captures" : [ "aunt.tica" ] } }
{ "_id" : 2, "name" : "Belle Gium", "email" : "belle.gium@example.com", "username" : { "match" : "belle.gium@example.com", "idx" : 0, "captures" : [ "belle.gium" ] } }
{ "_id" : 3, "name" : "Cam Bo Dia", "email" : "cam.dia@example.com", "username" : { "match" : "cam.dia@example.com", "idx" : 0, "captures" : [ "cam.dia" ] } }
{ "_id" : 4, "name" : "Fred", "username" : null }

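Second Stage

The stage uses the $set stage to reset username to the element at index zero of the "$username.captures" array. If the current value of username is null, the new value of username is set to null.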

{ "_id" : 1, "name" : "Aunt Arc Tikka", "email" : "aunt.tica@example.com", "username" : "aunt.tica" }
{ "_id" : 2, "name" : "Belle Gium", "email" : "belle.gium@example.com", "username" : "belle.gium" }
{ "_id" : 3, "name" : "Cam Bo Dia", "email" : "cam.dia@example.com", "username" : "cam.dia" }
{ "_id" : 4, "name" : "Fred", "username" : null }
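
For a quick check of the capture group outside the database, the same extraction can be sketched in plain JavaScript. The helper usernameSketch is hypothetical and treats a missing email as no match, which yields null as in the output above.

```javascript
// Same pattern as in the pipeline above; group 1 is the part before the @.
const userRe = /^([a-z0-9_.+-]+)@[a-z0-9_.+-]+\.[a-z0-9_.+-]+$/i;

// Hypothetical helper: adds a username field taken from the first capture.
function usernameSketch(employee) {
  const m = typeof employee.email === "string" ? userRe.exec(employee.email) : null;
  // m[1] is the first capture group, i.e. element 0 of the captures array.
  return { ...employee, username: m === null ? null : m[1] };
}

console.log(usernameSketch({ _id: 1, name: "Aunt Arc Tikka", email: "aunt.tica@example.com" }).username);
// aunt.tica
console.log(usernameSketch({ _id: 4, name: "Fred" }).username);
// null
```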

Tip

See also:

For more information on the behavior of the captures array and additional examples, see `captures` Output Behavior.

$reduce

$regexFindAll
