MongoDB(二)聚合查询

什么是MongoDB聚合框架

  • MongoDB聚合框架(Aggregation Framework)是一个计算框架,它可以:
    • 作用在一个或几个集合上
    • 对集合中的数据进行一系列运算
    • 将这些数据转化为期望的形式
  • 从效果而言,聚合框架相当于SQL查询中的:
    • GROUP BY
    • LEF OUTER JOIN
    • AS 等

管道(Pipeline)和步骤(Stage)

  • 整个聚合运算过程称为管道,他由多个步骤组成的,每个管道:
    • 接收一系列文档(原始数据)
    • 每个步骤对这些文档进行一系列运算
    • 结果文档输出给下一个步骤

image.png

聚合运算的基本格式

pipeline = [$stage1, $stage2, .... $stageN]

db.<集合>.aggregate(
    pipeline,
    {options}
);

常见步骤

步骤 作用 SQL等价运算符
$match 过滤 WHERE
$project 投影 AS
$sort 排序 ORDER BY
$group 分组 GROUP BY
\$skip/$limit 结果限制 SKIP/LIMIT
\$lookup 左外连接 LEFT OUTER JOIN
\$unwind 展开数组 N/A
\$graphLookup 图搜索 N/A
\$facet/\$bucket 分片搜索 N/A

常见步骤中的运算符

\$match \$project \$group
\$eq/\$gt/\$gte/\$lt/\$lte 选择需要的或排除不需要的字段 \$sum/\$avg
\$and/\$or/\$not/\$in \$map/\$reduce/\$filter \$push/\$addToSet
\$geoWithin/$intersect \$range \$max/\$min/\$last/\$first
….. \$multiply/\$divide/\$substract/\$add …..
….. \$year/\$month/\$dayOfMonth/\$hour/\$minute/\$second …..
….. ….. …..

MQL常用步骤和SQL对比

例1
SELECT 
    FIRST_NAME AS "名",
    LAST_NAME AS "姓"
FROM USERS
WHERE GENDER = "男"
SKIP 100
LIMIT 20

转换为MQL则为以下

db.users.aggregate([
    {$match: {gender: "男"}},
    {$skip: 100},
    {$limit: 20},
    {$project: {
        "名": "$first_name",
        "姓": "$last_name"
    }}
])
例2
SELECT 
    DEPARTMENT,
    COUNT(NULL) AS EMP_QTY
FROM USERS
WHERE GENDER = "女"
GROUP BY DEPARTMENT HAVING COUNT(*) < 10

转换为MQL则为以下:

db.users.aggregate([
    {$match: {gender: "女"}},
    {$group: {
        _id: "$DEPARTMENT",
        emp_qty: {$sum: 1}
    }},
    {$match: {emp_qty: {$lt: 10}}}
])

MQL特有步骤

$unwind

> db.students.find().pretty();
{
    "_id" : ObjectId("5ec113222c16c3352e6ff156"),
    "name" : "张三",
    "score" : [
        {
            "subject" : "语文",
            "score" : 84
        },
        {
            "subject" : "数学",
            "score" : 90
        },
        {
            "subject" : "外语",
            "score" : 69
        }
    ]
}

//使用聚合查询(展开score数组)
> db.students.aggregate([{$unwind: '$score'}]).pretty();
{
    "_id" : ObjectId("5ec113222c16c3352e6ff156"),
    "name" : "张三",
    "score" : {
        "subject" : "语文",
        "score" : 84
    }
}
{
    "_id" : ObjectId("5ec113222c16c3352e6ff156"),
    "name" : "张三",
    "score" : {
        "subject" : "数学",
        "score" : 90
    }
}
{
    "_id" : ObjectId("5ec113222c16c3352e6ff156"),
    "name" : "张三",
    "score" : {
        "subject" : "外语",
        "score" : 69
    }
}

$bucket

假如以下场景,需要统计各个区间的文档数量

image.png

db.products.aggregate([{
    $bucket: {
        //按照price进行分组
        groupBy: "$price",
        //分组区间
        boundaries: [0, 10, 20, 30, 40],
        default: "Other",
        output: {"count": {$sum: 1}}
    }
}])

$facet

多分组
image.png

db.products.aggregate([{
    $facet: {
        price: {
            $bucket: {...}
        },
        year: {
            $bucket: {...}
        }
    }
}]);

聚合实验

实验1: 获取订单总销量

计算到目前为止的所有订单的总销售额

> db.orders.aggregate([
    {
        $group: {
            //表示没有分组字段
            _id: null,
            total: {$sum: "$total"}
        }
    }
  ])

{ "_id" : null, "total" : NumberDecimal("44019609") }

实验2: 订单金额汇总

查询2019年第一季度(1月1日~3月31日)已完成订单(completed)的订单总金额和订单总数

> db.orders.find([
{
    $match: {
      status: "completed",
      orderDate: {
          $gte: ISODate("2019-01-01"),
          $lt: ISODate("2019-03-31")
}}}, 
{
    $group: {
      _id: null,
      total: {
        $sum: "$total"
      },
      shippingFee: {
        $sum: "$shippingFee"
      },
      count: {
        $sum: 1
      }
}}, 
{
    $project: {
      grandTotal: {
        "$add": ["$total", "$shippingFee"]
      },
      _id: 0,
      count: 1
}}])

发表评论