Masterless Replication System (3)-Limitations of Quorum Consistency
2022-07-31 16:39:00 【HUAWEI CLOUD】
If there are n replicas, and w and r are configured such that w + r > n, you can generally expect a read to return the most recent value. The set of nodes that accepted the write and the set of nodes that are read must overlap, so at least one of the nodes read holds the latest value, as shown in Figure 11.
Typically, r and w are each set to a simple majority (more than n/2) of the nodes, which ensures w + r > n while still tolerating fewer than n/2 node failures. However, a quorum does not have to be a majority. The only requirement is that the sets of nodes used by reads and by writes overlap in at least one node. Other quorum configurations are possible, which allows some flexibility in the design of distributed algorithms.
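The quorum condition is simple arithmetic, so it can be checked mechanically. The following is a minimal sketch, not taken from any particular database; the function names `is_strict_quorum`, `majority`, and `tolerated_failures` are hypothetical and introduced only for illustration.

```python
from typing import Tuple


def is_strict_quorum(n: int, w: int, r: int) -> bool:
    """True when any w written nodes and any r read nodes must share a node."""
    return w + r > n


def majority(n: int) -> int:
    """Smallest number of nodes that is more than n/2."""
    return n // 2 + 1


def tolerated_failures(n: int, w: int, r: int) -> Tuple[int, int]:
    """Unavailable nodes that writes and reads can each survive, respectively."""
    return n - w, n - r


# Majority quorums on 5 nodes: w = r = 3, and two nodes may be down.
n = 5
w = r = majority(n)
print(is_strict_quorum(n, w, r), tolerated_failures(n, w, r))

# A non-majority read quorum that still overlaps with every write: w = 4, r = 2.
print(is_strict_quorum(5, 4, 2))
```

The second call illustrates the point that r does not have to be a majority as long as the read and write sets are forced to overlap.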
You can also set w and r to smaller numbers, so that w + r ≤ n (that is, the quorum condition is not satisfied). In this case, reads and writes are still sent to all n nodes, but fewer successful responses are required for the operation to succeed.
With smaller w and r you are more likely to read stale data, because it is more likely that your read does not include the node with the latest value. On the other hand, this configuration allows lower latency and higher availability: if there is a network outage and many replicas become unreachable, there is a better chance that reads and writes can continue to be processed. Only when the number of reachable replicas falls below w or r does the database become unavailable for writing or reading, respectively.
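To get a rough feel for how much smaller w and r increase the chance of a stale read, the sketch below uses a deliberately simplified model. It assumes the latest value is stored on exactly w of the n nodes, that it is not propagated any further, and that the r responses used by a read come from a uniformly random subset of nodes. Real systems violate all three assumptions, so the result is only an illustration. The function name `stale_read_probability` is hypothetical.

```python
from math import comb


def stale_read_probability(n: int, w: int, r: int) -> float:
    """Chance that none of the r nodes read is among the w nodes holding
    the latest value, under the simplified model described above."""
    # comb(n - w, r) is 0 when r > n - w, i.e. when overlap is guaranteed.
    return comb(n - w, r) / comb(n, r)


for n, w, r in [(3, 2, 2), (3, 1, 1), (5, 2, 2)]:
    print(n, w, r, stale_read_probability(n, w, r))
```

Under this model the probability is zero exactly when w + r > n, and it grows as w and r shrink.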
However, even with w + r > n, there can be edge cases in which stale values are returned. These depend on the implementation, but possible scenarios include:
- If a sloppy quorum is used, the w writes may end up on different nodes than the r reads, so an overlap between the r nodes and the w nodes is no longer guaranteed [46].
- If two writes occur concurrently, it is not clear which one happened first. In this case, the only safe solution is to merge the concurrent writes (see Handling Write Conflicts). If a winner is picked based on a timestamp (last write wins), writes can be lost due to clock skew [35].
- If a write happens concurrently with a read, the write may be reflected on only some of the replicas. In this case, it is undetermined whether the read returns the old value or the new value.
- If a write succeeded on some replicas but failed on others (for example, because the disks on some nodes are full), and overall succeeded on fewer than w replicas, the write is reported as failed, but it is not rolled back on the replicas where it succeeded. This means that even though the write was reported as failed, subsequent reads may still return the value from that write [47].
- If a node carrying a new value fails, and its data is restored from a replica carrying an old value, the number of replicas storing the new value may fall below w, breaking the quorum condition.
- Even if everything is working correctly, there are edge cases in which you can get unlucky with the timing.
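The last-write-wins case in the list above can be made concrete with a small sketch. It is a toy model, not the conflict resolution code of any real database: the `Write` class, its `true_time` and `clock_skew` fields, and the `last_write_wins` function are all hypothetical names, and the skew of 5 seconds is an arbitrary illustrative value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    true_time: float   # when the write really happened (unknowable in practice)
    clock_skew: float  # assumed error of the writer's local clock, in seconds

    @property
    def timestamp(self) -> float:
        """Timestamp the writer attaches, taken from its skewed local clock."""
        return self.true_time + self.clock_skew


def last_write_wins(a: Write, b: Write) -> Write:
    """Keep whichever write carries the larger timestamp."""
    return a if a.timestamp >= b.timestamp else b


# The first writer's clock runs 5 s fast; the second write really happens later.
first = Write("cart=[milk]", true_time=100.0, clock_skew=5.0)
second = Write("cart=[milk, eggs]", true_time=102.0, clock_skew=0.0)

winner = last_write_wins(first, second)
print(winner.value)  # the later write has been silently discarded
```

Because the earlier write carries the larger timestamp, the later write is dropped without any error being reported, which is the kind of loss the clock skew bullet describes.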
So although quorums appear to guarantee that a read returns the latest written value, in practice it is not that simple. Dynamo-style databases are generally optimized for use cases that can tolerate eventual consistency. The parameters w and r let you adjust the probability of reading stale values, but it is unwise to treat them as absolute guarantees.
In particular, you usually do not get the guarantees discussed in "Problems with Replication Lag" (reading your own writes, monotonic reads, consistent prefix reads), so the anomalies described above can occur in applications. Stronger guarantees generally require transactions or consensus. We will return to these topics in Chapters 7 and 9.
4.2.1 Monitoring Staleness
From an operational perspective, it is important to monitor whether the database is returning up-to-date results. Even if the application can tolerate stale reads, you need to know the current health of replication. If it falls behind significantly, that is a signal that the cause should be investigated (for example, a network problem or an overloaded node).
In master-slave replication, the database usually exports metrics for replication lag, which can be fed into a monitoring system. This is possible because writes are applied on the master and on the slaves in the same order, and each node maintains its current position (offset) in the replication log. By comparing a slave's current offset with the master's, you can measure how far the slave lags behind the master.
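The offset comparison can be sketched in a few lines. This is an illustration under assumed inputs: the offsets are plain integers that a monitoring job is presumed to have collected already, and the names `replication_lag`, `lagging_replicas`, and the replica labels are hypothetical rather than any database's actual metrics API.

```python
from typing import Dict, List


def replication_lag(master_offset: int, replica_offsets: Dict[str, int]) -> Dict[str, int]:
    """Number of log entries each replica is behind the master."""
    return {
        name: max(0, master_offset - offset)
        for name, offset in replica_offsets.items()
    }


def lagging_replicas(lag: Dict[str, int], threshold: int) -> List[str]:
    """Names of replicas whose lag exceeds the alerting threshold."""
    return sorted(name for name, behind in lag.items() if behind > threshold)


lag = replication_lag(1000, {"replica-a": 998, "replica-b": 640})
print(lag)
print(lagging_replicas(lag, threshold=100))
```

The threshold is a tuning choice; a sensible value depends on the write rate and on how much staleness the application can tolerate.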
In a masterless replication system, there is no fixed order in which writes are applied, which makes monitoring more difficult. Moreover, if the database uses only read repair (no anti-entropy process), there is no limit to how old a value might be. If a value is rarely read, the value returned by a stale replica may be very old.
There has been some research on measuring replica staleness in masterless databases and predicting the expected percentage of stale reads from the parameters n, w, and r. Unfortunately, this is not yet common practice, but it would be good to include staleness measurements in the standard set of database metrics. Eventual consistency is a deliberately vague guarantee, so for operability it is valuable to be able to quantify "eventually".