One-to-One Join

Introduction

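In this tutorial, you can learn how to use PyMongo to construct an aggregation pipeline, perform the aggregation on a collection, and print the results by completing and running a sample app.

This aggregation performs a one-to-one join. A one-to-one join occurs when a document in one collection has a field value that matches a single document in another collection that has the same field value. The aggregation matches these documents on the field value and combines information from both sources into one result.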

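Tip

A one-to-one join does not require the documents to have a one-to-one relationship. To learn more about this data relationship, see the Wikipedia entry about One-to-one (data model).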

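Aggregation Task Summary

This tutorial demonstrates how to combine data from a collection that describes product information with data from another collection that describes customer orders. The results show a list of all orders placed in 2020, including the product details associated with each order.

This example uses two collections:

orders: contains documents that describe individual orders for products in a shop
products: contains documents that describe the products that a shop sells

An order can contain only one product, so the aggregation uses a one-to-one join to match each order document to its product document. The collections are joined on the product identifier: the product_id field in orders documents matches the id field in products documents.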

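Before You Get Started

Before you start this tutorial, complete the Aggregation Template App instructions to set up a working Python application.

After you set up the app, access the orders and products collections by adding the following code to the application: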

orders_coll = agg_db["orders"]
products_coll = agg_db["products"]

Delete any existing data and insert sample data into the orders collection as shown in the following code:

orders_coll.delete_many({})
order_data = [
    {
        "customer_id": "elise_smith@myemail.com",
        "orderdate": datetime(2020, 5, 30, 8, 35, 52),
        "product_id": "a1b2c3d4",
        "value": 431.43
    },
    {
        "customer_id": "tj@wheresmyemail.com",
        "orderdate": datetime(2019, 5, 28, 19, 13, 32),
        "product_id": "z9y8x7w6",
        "value": 5.01
    },
    {
        "customer_id": "oranieri@warmmail.com",
        "orderdate": datetime(2020, 1, 1, 8, 25, 37),
        "product_id": "ff11gg22hh33",
        "value": 63.13
    },
    {
        "customer_id": "jjones@tepidmail.com",
        "orderdate": datetime(2020, 12, 26, 8, 55, 46),
        "product_id": "a1b2c3d4",
        "value": 429.65
    }
]
orders_coll.insert_many(order_data)

Delete any existing data and insert sample data into the products collection as shown in the following code:

products_coll.delete_many({})
product_data = [
    {
        "id": "a1b2c3d4",
        "name": "Asus Laptop",
        "category": "ELECTRONICS",
        "description": "Good value laptop for students"
    },
    {
        "id": "z9y8x7w6",
        "name": "The Day Of The Triffids",
        "category": "BOOKS",
        "description": "Classic post-apocalyptic novel"
    },
    {
        "id": "ff11gg22hh33",
        "name": "Morphy Richardds Food Mixer",
        "category": "KITCHENWARE",
        "description": "Luxury mixer turning good cakes into great"
    },
    {
        "id": "pqr678st",
        "name": "Karcher Hose Set",
        "category": "GARDEN",
        "description": "Hose + nozzles + winder for tidy storage"
    }
]
products_coll.insert_many(product_data)
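
Because the join relies on each product_id value in orders matching exactly one id value in products, it can help to sanity-check the sample data before building the pipeline. The following is a minimal sketch in plain Python and is not part of the tutorial application. The helper name count_matches is hypothetical, and the two lists repeat only the identifier values from the sample data above.

```python
from collections import Counter

# Identifier values copied from the sample data above.
order_product_ids = ["a1b2c3d4", "z9y8x7w6", "ff11gg22hh33", "a1b2c3d4"]
product_ids = ["a1b2c3d4", "z9y8x7w6", "ff11gg22hh33", "pqr678st"]


def count_matches(order_ids, prod_ids):
    """Map each order's product_id to the number of products sharing that id."""
    id_counts = Counter(prod_ids)
    return {oid: id_counts[oid] for oid in order_ids}


matches = count_matches(order_product_ids, product_ids)
# For a one-to-one join, every order should resolve to exactly one product.
print(all(n == 1 for n in matches.values()))
```

A count of 0 would mean an order references a product that does not exist, and a count above 1 would mean the join is no longer one-to-one.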

Tutorial

Add a match stage for orders in 2020

First, add a $match stage that matches orders placed in 2020:

pipeline.append({
    "$match": {
        "orderdate": {
            "$gte": datetime(2020, 1, 1, 0, 0, 0),
            "$lt": datetime(2021, 1, 1, 0, 0, 0)
        }
    }
})
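
The $match stage above uses a half-open date range: $gte includes the first instant of 2020, and $lt excludes the first instant of 2021. As a rough illustration only, the same comparison can be sketched in plain Python against the order dates from the sample data. The helper name in_2020 is hypothetical and no database is involved.

```python
from datetime import datetime

# Order dates copied from the sample data in this tutorial.
order_dates = [
    datetime(2020, 5, 30, 8, 35, 52),
    datetime(2019, 5, 28, 19, 13, 32),
    datetime(2020, 1, 1, 8, 25, 37),
    datetime(2020, 12, 26, 8, 55, 46),
]

start = datetime(2020, 1, 1)  # inclusive lower bound, like $gte
end = datetime(2021, 1, 1)    # exclusive upper bound, like $lt


def in_2020(order_date):
    """Return True when the date falls in the half-open range [start, end)."""
    return start <= order_date < end


matched = [d for d in order_dates if in_2020(d)]
print(len(matched))
```

Using an exclusive upper bound avoids having to spell out the last representable moment of 2020.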

Add a lookup stage to link the collections

Next, add a $lookup stage. The $lookup stage joins the product_id field in the orders collection to the id field in the products collection:

pipeline.append({
    "$lookup": {
        "from": "products",
        "localField": "product_id",
        "foreignField": "id",
        "as": "product_mapping"
    }
})
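
To make the shape of the $lookup output concrete, here is a simplified in-memory sketch of an equality lookup. It is an approximation for illustration and not how the server implements the stage. The function name lookup and its parameters are hypothetical, and the documents are trimmed copies of the sample data.

```python
def lookup(orders, products, local_field, foreign_field, as_field):
    """Attach to each order a list of the products whose foreign_field
    equals the order's local_field."""
    joined = []
    for order in orders:
        matches = [
            p for p in products
            if p.get(foreign_field) == order.get(local_field)
        ]
        # The matches are always stored as a list, even when there is one.
        joined.append({**order, as_field: matches})
    return joined


orders = [{"customer_id": "elise_smith@myemail.com", "product_id": "a1b2c3d4"}]
products = [
    {"id": "a1b2c3d4", "name": "Asus Laptop", "category": "ELECTRONICS"},
    {"id": "z9y8x7w6", "name": "The Day Of The Triffids", "category": "BOOKS"},
]

result = lookup(orders, products, "product_id", "id", "product_mapping")
print(result[0]["product_mapping"])
```

The point to notice is that the joined field holds an array of matching documents, which is why later stages need to extract an element from it.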

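Add set stages to create new document fields

Next, add two $set stages to the pipeline.

The first $set stage sets the product_mapping field to the first element of the product_mapping array created in the previous $lookup stage.

The second $set stage creates two new fields, product_name and product_category, from the values in the product_mapping object field.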

pipeline.extend([
    {
        "$set": {
            "product_mapping": {"$first": "$product_mapping"}
        }
    },
    {
        "$set": {
            "product_name": "$product_mapping.name",
            "product_category": "$product_mapping.category"
        }
    }
])

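Tip

Because this is a one-to-one join, the $lookup stage adds only one array element to the input document. The pipeline uses the $first operator to retrieve the data from this element.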
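
The effect of the two $set stages on a single document can be sketched in plain Python as follows. This is an illustration under the assumption that the product_mapping array holds at least one element, as it does for the sample data. The function name apply_set_stages is hypothetical.

```python
def apply_set_stages(doc):
    """Mimic the two $set stages on one looked-up document."""
    # First stage: replace the array with its first element.
    first = doc["product_mapping"][0]
    doc = {**doc, "product_mapping": first}
    # Second stage: copy two nested values to new top-level fields.
    return {
        **doc,
        "product_name": first["name"],
        "product_category": first["category"],
    }


looked_up = {
    "customer_id": "elise_smith@myemail.com",
    "product_id": "a1b2c3d4",
    "product_mapping": [
        {"id": "a1b2c3d4", "name": "Asus Laptop", "category": "ELECTRONICS"}
    ],
}

flattened = apply_set_stages(looked_up)
print(flattened["product_name"], flattened["product_category"])
```

After these stages the product_mapping field is still present as an embedded document, which is why a later stage removes it.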

Add an unset stage to remove unneeded fields

Finally, add an $unset stage. The $unset stage removes unnecessary fields from the document:

pipeline.append({"$unset": ["_id", "product_id", "product_mapping"]})
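
As a rough picture of what $unset does to each document, the following plain-Python sketch drops the same three top-level fields from a dictionary. The function name unset_fields and the placeholder _id value are hypothetical and serve only as illustration.

```python
def unset_fields(doc, fields):
    """Return a copy of doc without the named top-level fields."""
    return {key: value for key, value in doc.items() if key not in fields}


before = {
    "_id": "placeholder-id",  # hypothetical value; real documents hold an ObjectId
    "customer_id": "elise_smith@myemail.com",
    "product_id": "a1b2c3d4",
    "product_mapping": {"id": "a1b2c3d4", "name": "Asus Laptop"},
    "product_name": "Asus Laptop",
    "product_category": "ELECTRONICS",
}

after = unset_fields(before, ["_id", "product_id", "product_mapping"])
print(sorted(after))
```

Only the customer and product summary fields remain, which matches the shape of the documents shown in the results.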

Run the aggregation pipeline

Add the following code to the end of your application to perform the aggregation on the orders collection:

aggregation_result = orders_coll.aggregate(pipeline)
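
The aggregate() method returns a cursor that you can iterate over. The template application is not reproduced on this page, so if your copy does not already print the results, a loop along these lines could do it. This is a sketch: the helper name print_documents is hypothetical, and a plain list stands in for the cursor so the example runs without a database.

```python
def print_documents(documents):
    """Print each document from any iterable and return how many were printed."""
    count = 0
    for document in documents:
        print(document)
        count += 1
    return count


# A list stands in for the cursor returned by aggregate().
sample_results = [
    {"product_name": "Asus Laptop", "product_category": "ELECTRONICS"},
    {"product_name": "Asus Laptop", "product_category": "ELECTRONICS"},
]
printed = print_documents(sample_results)
```

In the application you would pass aggregation_result to the helper instead of the sample list.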

Finally, run the following command in your shell to start your application:

python3 agg_tutorial.py

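Interpret results

The aggregated result contains three documents. The documents represent customer orders that occurred in 2020, with the product_name and product_category of the ordered product: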

{
  'customer_id': 'elise_smith@myemail.com',
  'orderdate': datetime.datetime(2020, 5, 30, 8, 35, 52),
  'value': 431.43,
  'product_name': 'Asus Laptop',
  'product_category': 'ELECTRONICS'
}
{
  'customer_id': 'oranieri@warmmail.com',
  'orderdate': datetime.datetime(2020, 1, 1, 8, 25, 37),
  'value': 63.13,
  'product_name': 'Morphy Richardds Food Mixer',
  'product_category': 'KITCHENWARE'
}
{
  'customer_id': 'jjones@tepidmail.com',
  'orderdate': datetime.datetime(2020, 12, 26, 8, 55, 46),
  'value': 429.65,
  'product_name': 'Asus Laptop',
  'product_category': 'ELECTRONICS'
}

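The result consists of documents that contain fields from both the orders collection and the products collection, joined by matching the product_id field of each order document to the id field of a product document.

To view the complete code for this tutorial, see the Completed One-to-one Join App on GitHub.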
