【文档数据库】Postgres JSONb与MongoDB相遇。。。

语言 Chinese, Simplified

SEO Title

前言

我听说开发人员不想接受MongoDB是他们操作数据库的可行选择的原因之一是PostgreSQL的JSON支持非常好，没有理由另谋高就。我经常对此感到好奇，并决定为了真正建立任何意见，我需要卷起袖子参与进来。

在这一系列的文章中，我将研究Postgres基于JSON的特定功能（我还不太熟悉），并将它们与MongoDB的JSON功能进行比较（鉴于我在那里的工作，我应该非常了解这些功能）。我希望能学到很多关于Postgres的知识，以及如何使用它，最终形成我自己的观点：现在Postgres支持JSON，还需要MongoDB吗？

我不会把Postgres看作一个RDBMS。对于那些真正想使用20世纪70年代制定的模式对数据进行建模的人来说，Postgres是一个不错的选择。如果你只想通过外键来建模你的关系，并且不介意通过一些脆弱的ORM将所有结果重新映射到JSON中，那么作为我的客人，Postgres无疑是最适合你的RDBMS。

然而，与许多现代开发人员一样，如果您认识到高效构建应用程序和服务需要使用一个存储层，该存储层可以按照我们想要的方式存储数据，那么MongoDB与Postgres的JSONb支持问题是非常有先见之明的。

花了很多时间研究这些帖子，阅读关于JSONb的问题和评论页面，我在Postgres留言板和论坛上一次又一次地看到一个问题出现：“为什么我不应该把所有数据都存储在一个JSONb专栏中？”。如果你熟悉MongoDB，这个问题会让你大吃一惊。

让我们开始吧。

第一部分。基本查询。

在我写这篇文章的时候，这是英国一个温暖晴朗的日子，我坐在郊区小花园的一张桌子旁。我们将首先构建一些在PGJSONb中使用JSON数据的快速示例，然后在MongoDB中比较相同的内容。

对于PG：

create table plants (
    id serial not null primary key,    
    plant jsonb not null
);

insert into plants values (1,'{"name": "Sweet Pea", "currentHeightCM": 82, "maxHeightCM":120, "tags": ["scented", "climber"]}');

insert into plants values (2,'{"name": "Sunflower", "currentHeightCM": 24, "maxHeightCM":240, "tags": ["fast growing", "tall", "phototropic"]}');

你会期望在MongoDB中同样的事情会更容易，当然，作为一个原生的JSON文档（对象）存储，它确实如此。

db.plants.insert({"name": "Sweet Pea", "currentHeightCM": 82, "maxHeightCM":120, "tags": ["scented", "climber"]})
db.plants.insert({"name": "Sunflower", "currentHeightCM": 24, "maxHeightCM":240, "tags": ["fast growing", "tall", "phototropic"]})

您不需要任何创建表定义（一个模式），Postgres定义中的id字段被MongoDB中自动生成的名为_id的objectID所取代。

让我们把这两条记录都拉回来，检查一下它们是否像预期的那样。

jim=# SELECT * FROM plants;
id |                                                       plant                                                      
----+-------------------------------------------------------------------------------------------------------------------
  1 | {"name": "Sweet Pea", "tags": ["scented", "climber"], "maxHeightCM": 120, "currentHeightCM": 82}
  2 | {"name": "Sunflower", "tags": ["fast growing", "tall", "phototropic"], "maxHeightCM": 240, "currentHeightCM": 24}
(2 rows)

jim=# 

garden> db.plants.find()
[
  {
    _id: ObjectId("62add6426e4370b84d1ff2ef"),
    name: 'Sweet Pea',
    currentHeightCM: 82,
    maxHeightCM: 120,
    tags: [ 'scented', 'climber' ]
  },
  {
    _id: ObjectId("62ade2dccc74a951c921497b"),
    name: 'Sunflower',
    currentHeightCM: 24,
    maxHeightCM: 240,
    tags: [ 'fast growing', 'tall', 'phototropic' ]
  }
]
garden>

通过属性查找文档也相当容易。对于MongoDB来说，它只是：

garden> db.plants.find({"name":"Sunflower"})
[
  {
    _id: ObjectId("62ade2dccc74a951c921497b"),
    name: 'Sunflower',
    currentHeightCM: 24,
    maxHeightCM: 240,
    tags: [ 'fast growing', 'tall', 'phototropic' ]
  }
]
garden>

在Postgres中，它仍然是非常基本的东西，尽管因为SQL不是用JSON构建的，所以我们必须引入几个新的操作符之一来处理JSON数据。（附带说明，JSON最近被添加到SQL:2016标准中作为模型，然后更新为SQL:2019作为数据类型，如本文所述，尽管我认为SQL:2019实际上还不存在）

在这种情况下，我们将使用包含运算符@>，它询问短语是否包含在JSONb字段中。这个操作符很有趣，因为它似乎只存在于Postgres中，所以您创建的任何使用它的SQL都不能在MySQL上运行。在这个博客的上下文中，这并不是什么大不了的事情，但我觉得奇怪的是，SQL标准定义了JSON的存储方式，但每个DB都有自己的查询方式。比这个更了解情况的博客也建议在这里使用GIN索引，因为事情变得非常慢，非常快。

jim=# SELECT * FROM plants WHERE plant @> '{"name": "Sunflower"}';
id |                                                       plant                                                      
----+-------------------------------------------------------------------------------------------------------------------
  2 | {"name": "Sunflower", "tags": ["fast growing", "tall", "phototropic"], "maxHeightCM": 240, "currentHeightCM": 24}
(1 row)

好了，我们不需要花了，让我们看看一些树。我的花园很小，但我设法进了不少，我喜欢它们带来的结构和高度。我可以使用JSON轻松地描述树木，而不是复制每棵树的信息，我可以描述一种类型的树，然后在我的花园周围嵌入该树的实例数组。例如，我有三棵苹果树（虽然只有一棵是真正的大树，但今年我靠着栅栏种了另外两棵，它们很小）。我有一个JSON对象，它描述了我需要了解的关于苹果树的信息，它们有多大，它们的花（花）的颜色和结出的果实。

这个对象中嵌入了一组子对象，这些子对象描述了这棵树在我的花园中的每个实例的位置、它的健康程度以及它的大小

{
    name: 'Corkscrew Willow',
    species: 'Salix matsudana Tortuosa',
    flowerColor: 'Yellow Catkins',
    maxHeightCM: 1200,
    instance: [
      {
        location: 'left fence',
        currentHeightCM: 400,
        health: 'vigorous'
      },
      {
        location: 'left fence',
        currentHeightCM: 420,
        health: 'vigorous'
      }
    ],
    tags: [ 'tree', 'twisted' ]
  }

这种类型的数据建模仍然是非规范化的，但关系建模的爱好者会认为，我可以将实例放入一个单独的表中，并在查询时将它们重新连接在一起。

是的，我可以，但为什么？JOIN很昂贵，ORM很脆弱，需要维护。我更愿意在我的应用程序将使用数据的状态下将数据返回到我的API，而数据库所做的工作量最小。（尽管我承认将API绑定到DB的模式从来都不是一个好主意，但它适用于本例。）

因此，这是我们将加载到数据库中的内容（我们将首先删除以前的数据）：

db.plants.drop()

db.plants.insertMany([
  {"name":"Apple Tree","species":"Malus domestica","flowerColor":"White","maxHeightCM":500,"fruit":{"color":"green","avSizeCM":10,"taste":"sweet"},"instance":[{"location":"right fence","currentHeightCM":83,"health":"good"},{"location":"right fence","currentHeightCM":65,"health":"poor"},{"location":"back left corner","currentHeightCM":450,"health":"vigorous"}],"tags":["tree","fruiting","slow growing"]},
  {"name":"Wild Cherry","species":"Prunus avium","flowerColor":"Pink","maxHeightCM":3000,"fruit":{"color":"red","avSizeCM":2,"taste":"sour"},"instance":[{"location":"next to decking","currentHeightCM":600,"health":"vigorous"}],"tags":["tree","fruiting"]},
  {"name":"Silver Birch","species":"Betula pendula","flowerColor":"Yellow Catkins","maxHeightCM":3100,"instance":[{"location":"next to decking","currentHeightCM":140,"health":"good"}],"tags":["tree"]},
  {"name":"Rowan","species":"Sorbus aucuparia","flowerColor":"White","maxHeightCM":1500,"fruit":{"color":"red","avSizeCM":0.5,"taste":"inedible"},"instance":[{"location":"back door","currentHeightCM":400,"health":"good"}],"tags":["tree","Ward Off Evil Spirits"]},
  {"name":"Olive","species":"Olea europaea","flowerColor":"White","maxHeightCM":1500,"fruit":{"color":"green","avSizeCM":1.5,"taste":"sweet"},"instance":[{"location":"back door","currentHeightCM":450,"health":"good"}],"tags":["tree","Mediterranean"]},
  {"name":"Corkscrew Willow","species":"Salix matsudana Tortuosa","flowerColor":"Yellow Catkins","maxHeightCM":1200,"instance":[{"location":"left fence","currentHeightCM":400,"health":"vigorous"},{"location":"left fence","currentHeightCM":420,"health":"vigorous"}],"tags":["tree","twisted"]},
])

Postgres不太容易插入多个JSONb对象，StackOverflow认为*如果插入的次数不多，性能就会出现问题。人们认为，最好的方法是创建一个临时表，将JSON的原始文本加载到其中，然后作为第二步将其解析为所需的表/状态。在我们的情况下，我们可以直接这样做，因为我们只有六条记录。（*公平地说，SO关心的可能更多的是将JSON批量解析为非JSON结构）

TRUNCATE plants;

INSERT INTO plants (plant) 
SELECT * FROM json_array_elements ('[
  {"name":"Apple Tree","species":"Malus domestica","flowerColor":"White","maxHeightCM":500,"fruit":{"color":"green","avSizeCM":10,"taste":"sweet"},"instance":[{"location":"right fence","currentHeightCM":83,"health":"good"},{"location":"right fence","currentHeightCM":65,"health":"poor"},{"location":"back left corner","currentHeightCM":450,"health":"vigorous"}],"tags":["tree","fruiting","slow growing"]},
  {"name":"Wild Cherry","species":"Prunus avium","flowerColor":"Pink","maxHeightCM":3000,"fruit":{"color":"red","avSizeCM":2,"taste":"sour"},"instance":[{"location":"next to decking","currentHeightCM":600,"health":"vigorous"}],"tags":["tree","fruiting"]},
  {"name":"Silver Birch","species":"Betula pendula","flowerColor":"Yellow Catkins","maxHeightCM":3100,"instance":[{"location":"next to decking","currentHeightCM":140,"health":"good"}],"tags":["tree"]},
  {"name":"Rowan","species":"Sorbus aucuparia","flowerColor":"White","maxHeightCM":1500,"fruit":{"color":"red","avSizeCM":0.5,"taste":"inedible"},"instance":[{"location":"back door","currentHeightCM":400,"health":"good"}],"tags":["tree","Ward Off Evil Spirits"]},
  {"name":"Olive","species":"Olea europaea","flowerColor":"White","maxHeightCM":1500,"fruit":{"color":"green","avSizeCM":1.5,"taste":"sweet"},"instance":[{"location":"back door","currentHeightCM":450,"health":"good"}],"tags":["tree","Mediterranean"]},
  {"name":"Corkscrew Willow","species":"Salix matsudana Tortuosa","flowerColor":"Yellow Catkins","maxHeightCM":1200,"instance":[{"location":"left fence","currentHeightCM":400,"health":"vigorous"},{"location":"left fence","currentHeightCM":420,"health":"vigorous"}],"tags":["tree","twisted"]}
]');

让我们做一些非常简单的查询。

我有多少棵树？
哪些类型的树开白花？
哪些树结出甜美的果实
我花园里哪棵树最高？
我有多少棵超过2米高的树？

我有多少棵树？

如果我想知道我有多少类型的树，也就是行的数量，这可能是一个非常简单的查询。但是，我想知道我总共有多少棵树，所以我需要计算实例数组的所有元素。

事实上这是MongoDB中一个非常简单的操作，因为它是为这类工作从头开始设计的。

db.plants.aggregate([
   {$unwind: {path: '$instance'}}, 
   {$count: 'TreeCount'}
])

我们首先展开数组，将元素打开到它们自己的文档中，然后计算所有结果。

[ { TreeCount: 9 } ]

对于Postgres来说，过程是不同的…

SELECT SUM(jsonb_array_length(plant -> 'instance')) FROM plants;

或者，有人建议我，这也会起作用…

SELECT COUNT(plant) FROM plants AS s, jsonb_array_elements(s.plant ->'instance') AS t;

如果你不知道PostgresJSONb操作符，那么解释这里发生了什么就有点困难了，但这里的关键是用于提取JSON对象的->操作符。

两个选项都给出了正确的结果：

sum
-----
9
(1 row)

哪些类型的树开白花？

这对两个平台来说都是另一个简单的操作，对于Postgres，我们使用@>包含运算符，它看起来与左侧选择中运算符右侧的JSON匹配。

SELECT plant['name'] FROM plants WHERE plant @> '{"flowerColor":"White"}';plant

--------------

"Apple Tree"

"Rowan"

"Olive"

(3 rows)

此查询中使用了MongoDB的find（）运算符。

db.plants.find({'flowerColor': 'White'},{_id:0,name:1})[
  { name: 'Apple Tree' },
  { name: 'Rowan' },
  { name: 'Olive' }
]

在上面的MongoDB查询中，我还包含了一个投影参数{_id:0，name:1}，它禁止在结果集中返回_id字段，包含name:1意味着我将传入一个我确实希望返回的字段列表，在本例中仅返回name。它完全是可选的，省略它以返回完整的文档。

哪些树结出甜美的果实？

这是一个有趣的问题，因为我花园里的一些树根本没有结果，所以JSON对象不包含水果字段。这对MongoDB来说不是问题，查询再简单不过了…

db.plants.find({'fruit.taste': 'sweet'},{_id:0,name:1})[
  { name: 'Apple Tree' },
  { name: 'Olive' }
]

在这里，我们正在做一个简单的查找，其余的由MongoDB完成，它不担心某些文档中没有水果对象，而且无论嵌套有多深，都可以很容易地使用嵌套对象。

然而，Postgres的情况就不同了。我承认我不是Postgres的超级明星，但我的SQL并不太生疏。我应该能够做到这一点…

首先，我尝试了类似于MongoDB的查询，我认为它会起作用，因为它看起来很合乎逻辑…

SELECT * FROM plants WHERE plant @> '{"fruit.taste": "sweet"}';

这应该有效，它被要求返回包含{"fruit.taste": "sweet"}的所有行，有趣的是，它没有出错，只是返回（0行）

啊，也许Postgres不理解JSONb嵌套上下文中的点表示法，这很公平。

SELECT * FROM plants WHERE plant @> '{"taste": "sweet"}';

{"taste":"sweet"} 必须包含在JSONb结构中的某个位置，对吗？nope，（0行）

好吧，也许它不是递归的，它需要指向子对象中属性的位置，但我们不能，因为有些行根本没有水果子对象！

然后我发现了一个著名的Postgres支持团队的博客文章，其中介绍了各种紧密的解决方案。他们提出的一个选项是将JSONb转换为文本，然后再转换回JSON，这似乎是一个非常冗长的选项…

select id, t.* from plants, jsonb_to_recordset((plant->>'fruit'::text)::jsonb) as t(color text,taste varChar(100), avSizeCM text) where taste like 'sweet';

这对我来说仍然不起作用。他们在博客文章中提供的其他选项真的很复杂，而且他们自己也承认，性能不好，使用了生成的列和表达式索引以及其他我不完全理解的东西。

最后，在r/postgreSQL中四处询问后，我找到了一条与许多Postgres爱好者流行的不和通道。我解释了这个问题，值得赞扬的是，他们像对待公牛一样接受了挑战。过了一段时间，解决方案发布了！

with fruit_cte as
(
select plant->>'name' as name,
plant->'fruit' as fruit
from plants
)
select name,fruit->>'taste' as taste
from fruit_cte
where fruit->>'taste' = 'sweet';
name        | taste 
------------+-------
 Apple Tree | sweet
 Olive      | sweet
(2 rows)

（好吧，我知道橄榄不是很甜，但它们也不酸，我正在制作这个数据集，所以不要根据我的水果定义来挑剔我！）

我们终于到了！我花了两天的时间进行实验，尝试不同的东西，用头撞墙，才放弃。即使是前面提到的备受尊敬的专业支持团队也找不到这个解决方案，这取决于一些非常有才华的以社区为导向的人，他们喜欢解决棘手的挑战，能够不假思索地在MongoDB中完成一些事情。

我花园里哪棵树最高？

我花园里最高的树不是maxHeightCM最高的树，而是instance.currentHightCM最高的树。

在MongoDB中实现这一点的一种方法是使用聚合管道，像这样的查询显示了它是多么容易使用。（如果你想看看它有多神奇，请查看@thedonester的权威文本）

db.plants.aggregate([
  {$unwind: {path: "$instance"}}, 
  {$sort: {"instance.currentHeightCM" :-1}}, 
  {$limit: 1}, 
  {$project: {
    _id:0,
    name:1,
    "location":"$instance.location",
    "height":"$instance.currentHeightCM"
  }}
])

聚合由多个阶段组成，第一阶段的输出通过管道连接到第二阶段的输入

第一个阶段是展开，这是处理数组的一个非常方便的工具，因为它允许您将数组分解并为每个元素创建一个单独的文档。

接下来，我们对数组进行排序，在limit阶段之前，它只返回一个文档。

最后，我将在项目阶段运行结果集，以重塑数据。这是可选的，但如果不打算使用所有原始文档，您可能不希望将其返回到应用程序。

您还将注意到，我已经使用投影阶段重命名了几个属性，这只是为了可读性。

它会产生正确的结果：

[{ name: 'Wild Cherry', location: 'next to decking', height: 600 }]

还有其他方法可以做到这一点，有些甚至更简单，它们不使用聚合框架，但这个例子很好地介绍了使用管道进行处理的能力。

我们如何在Postgres中做到这一点？第一项工作是列出所有的树。

SELECT plant->>'name' AS name, plant->'instance' AS instance FROM plants

输出如下所示：

       name       |                                                                                                           instance                                                                                                          
------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Apple Tree       | [{"health": "good", "location": "right fence", "currentHeightCM": 83}, {"health": "poor", "location": "right fence", "currentHeightCM": 65}, {"health": "vigorous", "location": "back left corner", "currentHeightCM": 450}]
Wild Cherry      | [{"health": "vigorous", "location": "next to decking", "currentHeightCM": 600}]
Silver Birch     | [{"health": "good", "location": "next to decking", "currentHeightCM": 140}]
Rowan            | [{"health": "good", "location": "back door", "currentHeightCM": 400}]
Olive            | [{"health": "good", "location": "back door", "currentHeightCM": 450}]
Corkscrew Willow | [{"health": "vigorous", "location": "left fence", "currentHeightCM": 400}, {"health": "vigorous", "location": "left fence", "currentHeightCM": 420}]
(6 rows)

在下一阶段，我大量借鉴了上面关于甜水果的例子。如果没有乐于助人的Postgres社区的支持，我不可能做到这一点

现在，我们需要在该结果集上运行SELECT，但使用一个名为jsonb_to_recordset（）的新（对我们来说）运算符，该运算符会分解数组并为每个数组元素创建一行。它相当于Postgres的MongoDB的$unver，但是，对于Postgres，您必须使用AS条件显式地强制转换jsonb_to_recordset（）的输出。

with instance_cte as
(
  SELECT plant->>'name' AS name,
  plant->'instance' AS instance
  FROM plants
)
SELECT * FROM instance_cte, jsonb_to_recordset(instance_cte.instance) 
AS items(health jsonb, location jsonb, currentHeightCM jsonb);

我们还没到。结果集如下所示：

       name       |                                                                                                           instance                                                                                                           |   health   |      location      | currentheightcm 
------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+--------------------+-----------------
 Apple Tree       | [{"health": "good", "location": "right fence", "currentHeightCM": 83}, {"health": "poor", "location": "right fence", "currentHeightCM": 65}, {"health": "vigorous", "location": "back left corner", "currentHeightCM": 450}] | "good"     | "right fence"      | 
 Apple Tree       | [{"health": "good", "location": "right fence", "currentHeightCM": 83}, {"health": "poor", "location": "right fence", "currentHeightCM": 65}, {"health": "vigorous", "location": "back left corner", "currentHeightCM": 450}] | "poor"     | "right fence"      | 
 Apple Tree       | [{"health": "good", "location": "right fence", "currentHeightCM": 83}, {"health": "poor", "location": "right fence", "currentHeightCM": 65}, {"health": "vigorous", "location": "back left corner", "currentHeightCM": 450}] | "vigorous" | "back left corner" | 
 Wild Cherry      | [{"health": "vigorous", "location": "next to decking", "currentHeightCM": 600}]                                                                                                                                              | "vigorous" | "next to decking"  | 
 Silver Birch     | [{"health": "good", "location": "next to decking", "currentHeightCM": 140}]                                                                                                                                                  | "good"     | "next to decking"  | 
 Rowan            | [{"health": "good", "location": "back door", "currentHeightCM": 400}]                                                                                                                                                        | "good"     | "back door"        | 
 Olive            | [{"health": "good", "location": "back door", "currentHeightCM": 450}]                                                                                                                                                        | "good"     | "back door"        | 
 Corkscrew Willow | [{"health": "vigorous", "location": "left fence", "currentHeightCM": 400}, {"health": "vigorous", "location": "left fence", "currentHeightCM": 420}]                                                                         | "vigorous" | "left fence"       | 
 Corkscrew Willow | [{"health": "vigorous", "location": "left fence", "currentHeightCM": 400}, {"health": "vigorous", "location": "left fence", "currentHeightCM": 420}]                                                                         | "vigorous" | "left fence"       | 
(9 rows)

它返回的信息比我们实际需要的要多得多，所以我们将在下一步进行投影，但至少每个实例有一行，而不是每棵树有一行。

但有些可疑的事情正在发生，如果你看看在最右边专栏，对于currentHeightCM来说，它是空的，这是一个遗憾，因为它是我们将需要的。我们也必须解决这个问题。

让我们把我们不想要的东西都投射出来。

with instance_cte as
(
  SELECT plant->>'name' AS name,
  plant->'instance' AS instance
  FROM plants
)
SELECT instance_cte.name, items.location, items.currentheightcm  FROM instance_cte, jsonb_to_recordset(instance_cte.instance) AS items(location jsonb, currentHeightCM jsonb);

now we are being more explicit about what we want to return, we can see the problem clearly.

name              |      location      | currentheightcm 
------------------+--------------------+-----------------
 Apple Tree       | "right fence"      | 
 Apple Tree       | "right fence"      | 
 Apple Tree       | "back left corner" | 
 Wild Cherry      | "next to decking"  | 
 Silver Birch     | "next to decking"  | 
 Rowan            | "back door"        | 
 Olive            | "back door"        | 
 Corkscrew Willow | "left fence"       | 
 Corkscrew Willow | "left fence"       | 
(9 rows)

currentHeightCM出了什么问题？它是子对象中唯一一个int而不是text的字段，但这不可能是问题所在（我试着将其转换为int，但仍然不起作用）。

几个小时过去了。

为什么它将结果集中的列名转换为小写？不，等等，这是我写为Lower Camel Case的唯一属性名称。不可能，那太疯狂了。为什么Postgres的属性名称必须全部小写？为了确定起见，我最好测试一下。

我创建了一个名为plants3的新表，并插入了与以前相同的数据，但将所有对currentHeightCM的引用都更改为currentHeightCM，然后重新运行与上面完全相同的查询。

       name       |      location      | currentheightcm 
------------------+--------------------+-----------------
 Apple Tree       | "right fence"      | 83
 Apple Tree       | "right fence"      | 65
 Apple Tree       | "back left corner" | 450
 Wild Cherry      | "next to decking"  | 600
 Silver Birch     | "next to decking"  | 140
 Rowan            | "back door"        | 400
 Olive            | "back door"        | 450
 Corkscrew Willow | "left fence"       | 400
 Corkscrew Willow | "left fence"       | 420
(9 rows)

这正是使用连接到RDBMS上的一个好主意（JSONb）来处理这种工作负载的问题。无论你认为数据建模的灵活性有多大，总会有一些令人讨厌的陷阱潜伏在那里，作为关系模式僵化的后遗症。

我后来发现了一份关于这个主题的错误报告，报告者建议他们不能将camelCase与jsonb_to_recordSet（）一起使用，然而，受访者认为这是故意的行为…。

未引用的标识符在postgres中用小写字母表示

在我们因沮丧而死之前，让我们先完成剩下的查询。

我需要ORDERBY和LIMIT来匹配MongoDB的输出。

with instance_cte as
(
  SELECT plant->>'name' AS name,
  plant->'instance' AS instance
  FROM plants3
)
SELECT instance_cte.name, items.location, items.currentheightcm FROM instance_cte, jsonb_to_recordset(instance_cte.instance) AS items(location jsonb, currentheightcm jsonb) ORDER BY items.currentheightcm DESC LIMIT 1;

and at last…

name         |     location      | currentheightcm 
-------------+-------------------+-----------------
 Wild Cherry | "next to decking" | 600
(1 row)

我本应该将高度强制转换为int，但我很高兴看到Postgres仍然可以对其进行排序，即使它被强制转换为JSONb。对于下一个查询，我肯定要这样做。此外，我不确定为什么location是唯一一个用双引号括起来的字段。

我有多少棵超过2米高的树？

这里的计划是在上一个查询的基础上添加一个范围匹配。我怀疑这对两个平台来说都很容易。

对于MongoDB，查询比以前更简单：

db.plants.aggregate([
  {$unwind: {path: "$instance",}}, 
  {$match: {"instance.currentHeightCM":{$gt: 200}}}, 
  {$count: 'Trees_GT_2m'}
])

and the output is:

[{ Trees_GT_2m: 6 }]

For Postgres, I’m going to start with my previous query, this time casting the height into an int and drop a WHERE clause and COUNTin there.

with instance_cte as
(
  SELECT plant->>'name' AS name,
  plant->'instance' AS instance
  FROM plants3
)
SELECT COUNT(*) FROM instance_cte, jsonb_to_recordset(instance_cte.instance) AS items( location jsonb, currentheightcm int) WHERE items.currentheightcm > 200;

And the result…

count 
-------
     6
(1 row)

这个没有什么可补充的，现在我们已经完成了Postgres查询的所有繁重工作，一切都很顺利。我仍然建议，如果并排比较这两个查询，MongoDB查询看起来更干净、直观，而且更易于阅读。

结论

在第一部分中，我们研究了在Postgres中查询JSONb，并与MongoDB进行了比较。Postgres中对JSONb（和JSON）运算符的支持看起来很棒，但单独使用文档来告知您的可用性和实用性是危险的。实际情况是，一旦你走出了一个相当狭窄的简单查询范围，构建任何远程复杂的东西（例如Postgress应该能够进行的实时分析的分组或聚合）都会变得不直观且耗时。我很容易遇到一些问题，这些问题只有那些最聪明、渴望益智的Postgres超级用户才能解决（他们手头有时间）。如果我在构建、运行和维护生产服务，我会非常担心这一点。

我认为不可否认的是，如果你是现代开发社区的一员，需要使用JSON格式的数据作为你的应用程序和服务的主要来源，那么使用MongoDB而不是Postgres会更有效率。

在下一部分中，我们将增加数据量并重新运行一些查询，以了解MongoDB Postgres JSONb在更大范围内的表现。

本文地址

https://architect.pub/postgres-jsonb-meets-mongodb

适合打印版本
登录发表评论
62 次浏览

发布日期

星期六, 八月 26, 2023 - 11:58

最后修改

星期三, 三月 20, 2024 - 09:00

【文档数据库】Postgres JSONb与MongoDB相遇。。。

category

前言

第一部分。基本查询。

我有多少棵树？

哪些类型的树开白花？

哪些树结出甜美的果实？

我花园里哪棵树最高？

我有多少棵超过2米高的树？

结论

Tags

最新内容

Content type

Content type

Tags

Tags

category

category

Tags