
Aerospike architecture overview     Category: aerospike
https://www.aerospike.com/docs/architecture/index.html


Three-layer architecture

Client Layer
This cluster-aware layer includes open source client libraries, 
which implement Aerospike APIs, track nodes, and know where data resides in the cluster.

Clustering and Data Distribution Layer
This layer manages cluster communications and automates fail-over, 
replication, cross data center synchronization, and intelligent re-balancing and data migration.

Data Storage Layer
This layer reliably stores data in DRAM and Flash for fast retrieval. 

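Client layer
Cluster-aware; open source client libraries that implement the APIs, track nodes, and know where data lives in the cluster

Clustering and data distribution layer
Manages cluster communication; automatic fail-over, replication, cross data center synchronization, intelligent re-balancing, data migration

Storage layer
Reliably stores data in DRAM and flash for fast retrieval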



Client Layer

Implements the Aerospike API, the client-server protocol, and talks directly to the cluster.
Communicates directly with the cluster

Tracks nodes and knows where data is stored, 
instantly learning of changes to cluster configuration or when nodes go up or down.
Tracks nodes and data location; learns immediately of cluster configuration changes and of nodes going up or down.


Implements its own TCP/IP connection pool for efficiency. 
Also detects transaction failures that have not risen to the level of node failures in the cluster 
and re-routes those transactions to nodes with copies of the data.

Connection pool
Failure detection, re-routing, reading from replica copies


Transparently sends requests directly to the node with the data 
and re-tries or re-routes requests as needed (for example, during cluster re-configurations).

Requests go straight to the node that holds the data and are retried or re-routed as needed (for example, during cluster re-configuration).
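The re-routing behaviour described above can be sketched roughly as follows. This is not the Aerospike client code: `read_with_reroute`, `NodeUnavailable`, and `fake_send` are hypothetical names invented for the illustration, and the transport is a stub.

```python
from typing import Callable, List


class NodeUnavailable(Exception):
    """Raised by the (hypothetical) transport when a node cannot serve a request."""


def read_with_reroute(key: str,
                      replicas: List[str],
                      send: Callable[[str, str], str]) -> str:
    """Try each node that holds a copy of the key, in order, until one answers."""
    last_error = None
    for node in replicas:
        try:
            return send(node, key)
        except NodeUnavailable as err:
            last_error = err  # remember the failure and try the next copy
    raise NodeUnavailable(f"no replica could serve {key!r}") from last_error


def fake_send(node: str, key: str) -> str:
    """Stub transport for the example: node-a is down, every other node answers."""
    if node == "node-a":
        raise NodeUnavailable(node)
    return f"{key}@{node}"


# The first node fails, so the request is re-routed to the second copy.
result = read_with_reroute("user:1", ["node-a", "node-b"], fake_send)
```

The application only sees the final result; which node answered is an internal detail of the client.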



This architecture reduces transaction latency, 
offloads work from the cluster, and eliminates work for the developer. 
It ensures that applications do not have to restart when nodes are brought up or down. 
And you don't have to waste time with cluster setup or add cluster management servers or proxies.

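Lower transaction latency, less work for the cluster, less work for the developer.
Applications need no restart when nodes are brought up or down.
No time lost on cluster setup, and no extra cluster management servers or proxies.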


Distribution Layer


The “shared nothing” architecture is designed to reliably store terabytes of data with automatic fail-over, 
replication, and cross data-center synchronization. This layer scales linearly.

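Shared-nothing architecture; reliable storage of terabytes of data
Automatic fail-over, replication, and cross data-center synchronization
Scales linearly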

The Distribution layer is designed to eliminate manual operations 
with the systematic automation of all cluster management functions. 

The distribution layer removes manual operations by systematically automating every cluster management function.

Three modules

Cluster Management Module
Tracks nodes in the cluster. 
The key algorithm is a Paxos-based gossip-voting process that determines which nodes are considered part of the cluster. 
Aerospike implements a special heartbeat (both active and passive) to monitor inter-node connectivity.

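Cluster management module
Tracks the nodes in the cluster
Key algorithm: a Paxos-based gossip-voting process that decides which nodes count as part of the cluster.
A special heartbeat (both active and passive) monitors connectivity between nodes.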


Data Migration Module 

When you add or remove nodes, Aerospike Database cluster membership is ascertained. 
Each node uses a distributed hash algorithm to 
divide the primary index space into data slices and assign owners. 
The Aerospike Data Migration module intelligently balances data distribution across all nodes in the cluster, 
ensuring that each piece of data is replicated across cluster nodes and datacenters 
as specified in the system replication factor configuration. 


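Data migration module

Cluster membership is determined whenever nodes are added or removed.
Each node uses a distributed hash algorithm to divide the primary index space into data slices and assign owners.
Data distribution is balanced intelligently across all nodes, so that every piece of data is replicated across cluster nodes and datacenters.

Controlled by the replication factor configuration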

Cross-Datacenter Replication (XDR)

Division is purely algorithmic. 
The system scales without a master and eliminates the need for additional configuration as required in a sharded environment.

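Partitioning is purely algorithmic.
The system scales without a master node, so the extra configuration a sharded environment requires is not needed.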
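A minimal sketch of what "purely algorithmic" division can look like: every node runs the same deterministic functions and arrives at the same answer, so no master has to hand out assignments. The details are assumptions made for the illustration. SHA-1 stands in for the real digest, the slice count of 4096 is fixed by hand, and the rendezvous-style ranking in `owners` is a generic technique, not Aerospike's actual assignment algorithm.

```python
import hashlib
from typing import List

N_PARTITIONS = 4096  # number of data slices assumed for this sketch


def partition_id(set_name: str, user_key: str) -> int:
    """Map a record key to a data slice by hashing it (SHA-1 is a stand-in digest)."""
    digest = hashlib.sha1(f"{set_name}:{user_key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


def owners(pid: int, nodes: List[str], replication_factor: int) -> List[str]:
    """Rank nodes for one slice by hashing (slice, node); the first entries own it.

    The ranking depends only on the slice id and the node names, so every node
    that knows the cluster membership computes the same owner list.
    """
    ranked = sorted(
        nodes,
        key=lambda n: hashlib.sha1(f"{pid}:{n}".encode("utf-8")).digest(),
    )
    return ranked[:replication_factor]


cluster = ["n1", "n2", "n3", "n4"]
pid = partition_id("users", "u1")
example_owners = owners(pid, cluster, replication_factor=2)
```

With this kind of ranking, removing a node that does not own a slice leaves that slice's owner list unchanged, which keeps data movement small when membership changes.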




Transaction Processing Module

Reads and writes data on request, and provides the consistency and isolation guarantees. 
This module is responsible for:

Sync/Async Replication: 
For writes with immediate consistency, 
it propagates changes to all replicas before committing the data and returning the result to the client.

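Sync/async replication:
For immediate-consistency writes, changes are propagated to all replicas before the data is committed and the result is returned to the client.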


Proxy: 
In rare cases during cluster re-configurations when the Client Layer may be briefly out of date, 
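the Transaction Processing module transparently proxies the request to another node.
Rare case: during cluster re-configuration the client layer may be briefly out of date, and the request is transparently proxied to another node.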


Duplicate Resolution: 
For clusters recovering from being partitioned, 
this module resolves any conflicts between different copies of data. 
Resolution is configurable to be based on the generation (version) or expiration timestamp.

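For clusters recovering from a partition, conflicts between different copies are resolved
by generation (version) or by expiration timestamp.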
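As a rough illustration of the two resolution rules, the sketch below picks a winner among conflicting copies of one record. `RecordCopy`, `resolve`, and the policy names `"generation"` and `"expiration"` are hypothetical names for this example, not server settings.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecordCopy:
    node: str         # node holding this copy
    generation: int   # version counter, assumed to grow with every write
    expiration: int   # expiration timestamp of the copy


def resolve(copies: List[RecordCopy], policy: str = "generation") -> RecordCopy:
    """Pick the surviving copy after a partition heals, under the chosen policy."""
    if policy == "generation":
        return max(copies, key=lambda c: c.generation)
    if policy == "expiration":
        return max(copies, key=lambda c: c.expiration)
    raise ValueError(f"unknown policy: {policy!r}")


copies = [
    RecordCopy(node="n1", generation=3, expiration=1000),
    RecordCopy(node="n2", generation=5, expiration=900),
]
by_generation = resolve(copies)                 # the copy with more writes wins
by_expiration = resolve(copies, "expiration")   # the copy that expires later wins
```

The two policies can disagree, as they do here, which is why the choice is a configuration decision.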


Data Storage Layer

key-value store   
schemaless data model
 
 
Aerospike vs RDBMS

Aerospike   RDBMS
namespace   database
set         table
record      row
bin         column

You do not need to define sets and bins in advance. For maximum flexibility, they can be added at run-time.
Values in bins are strongly typed.
Bins themselves are not typed, so different records can have the same bin with values of different types.

Sets and bins need no up-front definition.
Values in bins are typed.
The same bin may hold values of different types in different records.
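The schemaless model can be pictured with a small in-memory stand-in. This is not the Aerospike client API: the `put` helper and the dictionary layout are assumptions made only to show that sets and bins appear on first use and that a bin carries no type of its own.

```python
from typing import Any, Dict, Tuple

# (namespace, set, key) -> {bin name: value}; a toy model of the record store
Store = Dict[Tuple[str, str, str], Dict[str, Any]]


def put(store: Store, namespace: str, set_name: str, key: str,
        bins: Dict[str, Any]) -> None:
    """Create or update a record; sets and bins come into existence on first write."""
    store.setdefault((namespace, set_name, key), {}).update(bins)


store: Store = {}
put(store, "test", "users", "u1", {"age": 30})             # integer value in bin "age"
put(store, "test", "users", "u2", {"age": "thirty"})       # string value in the same bin
put(store, "test", "users", "u1", {"tags": ["a", "b"]})    # new bin added at run-time

age_types = {key[2]: type(rec["age"]).__name__ for key, rec in store.items()}
```

Each value keeps its own type, while the bin name "age" is free to hold an integer in one record and a string in another.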

Indexes, including the primary index and optional secondary indexes, 
are stored by default in DRAM for ultra-fast access. 
The primary index can also be configured to be stored in Persistent Memory or on an NVMe flash device. 
Values can be stored either in DRAM or more cost-effectively on SSDs. 
You can configure each namespace separately, so small namespaces can take advantage of DRAM and larger ones gain the cost benefits of SSDs.

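Primary index, plus optional secondary indexes
Indexes live in DRAM by default for fast access
Data can be configured to be stored in DRAM or on SSD
Each namespace is configured independently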



100 million keys only take up 6.4GB. 
Although keys have no size limitations, each key is efficiently stored in just 64 bytes.

Each key occupies 64 bytes
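The 6.4GB figure follows directly from the 64-byte entry size, and the same arithmetic can be used for rough capacity planning. The `replication_factor` parameter is an addition for this sketch, on the assumption that each replica keeps its own index entry.

```python
INDEX_ENTRY_BYTES = 64  # per-key index size quoted above


def primary_index_bytes(n_records: int, replication_factor: int = 1) -> int:
    """Estimate primary index memory across the cluster, in bytes."""
    return n_records * INDEX_ENTRY_BYTES * replication_factor


single_copy_gb = primary_index_bytes(100_000_000) / 10**9      # 6.4 (decimal GB)
two_copies_gb = primary_index_bytes(100_000_000, 2) / 10**9    # 12.8 (decimal GB)
```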

Native, multi-threaded, multi-core Flash I/O 
and an Aerospike log structured file system take advantage of low-level SSD read and write patterns. 
To minimize latency, writes to disk are performed in large blocks. 
This mechanism bypasses the standard file system, historically tuned to rotational disks.

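Multi-threaded, multi-core flash I/O
The log-structured file system exploits low-level SSD read and write patterns and bypasses the standard file system.
To minimize latency, writes to disk are performed in large blocks.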


The Smart Defragmenter and Intelligent Evictor work together to ensure 
that there is space in DRAM and that data is never lost and is always safely written to disk.

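The Smart Defragmenter and the Intelligent Evictor work together to keep DRAM space available and to make sure data is never lost and is always safely written to disk.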

Smart Defragmenter: 
Tracks the number of active records in each block and reclaims blocks that fall below a minimum level of use.
Per-block count of active records; blocks that drop below the minimum use level are reclaimed.
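The selection rule can be sketched as a simple filter over per-block counters. `Block`, `blocks_to_reclaim`, and the `min_use_pct` threshold are hypothetical names for this example; a real defragmenter would also have to move the remaining live records elsewhere before reusing a block.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    block_id: int
    capacity: int        # records the block can hold
    active_records: int  # records in the block that are still live


def blocks_to_reclaim(blocks: List[Block], min_use_pct: float) -> List[int]:
    """Return ids of blocks whose share of live records is below the threshold."""
    return [
        b.block_id
        for b in blocks
        if 100 * b.active_records / b.capacity < min_use_pct
    ]


blocks = [
    Block(block_id=0, capacity=100, active_records=80),
    Block(block_id=1, capacity=100, active_records=20),
    Block(block_id=2, capacity=100, active_records=50),
]
reclaimable = blocks_to_reclaim(blocks, min_use_pct=50)
```

Only block 1 is selected here; block 2 sits exactly at the threshold and is left alone because the comparison is strict.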



Intelligent Evictor: 
Removes expired records and reclaims memory if the system gets beyond a set high-water mark. 
Expiration times are configured per namespace. 
Record age is calculated from the last modification. 
The application can override the default lifetime and specify that a record should never be evicted.

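Removes expired records; reclaims memory once usage passes the configured high-water mark.
Record age is counted from the last modification.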
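A minimal sketch of the two steps, under stated assumptions: records are modelled as `(expiration, size)` pairs, an expiration of `None` marks a record that must never be evicted, and eviction takes the records closest to expiry first. The function and parameter names are invented for the illustration.

```python
from typing import Dict, List, Optional, Tuple

# key -> (expiration timestamp or None for "never evict", size in bytes)
Records = Dict[str, Tuple[Optional[int], int]]


def expire_and_evict(records: Records, now: int,
                     capacity: int, high_water_pct: float) -> List[str]:
    """Drop expired records, then evict until usage is back under the high-water mark."""
    removed: List[str] = []

    # Step 1: remove everything whose expiration time has passed.
    expired = [k for k, (exp, _) in records.items() if exp is not None and exp <= now]
    for key in expired:
        del records[key]
        removed.append(key)

    # Step 2: if usage is still above the mark, evict records nearest to expiry.
    limit = capacity * high_water_pct / 100
    used = sum(size for _, size in records.values())
    candidates = sorted((exp, k) for k, (exp, _) in records.items() if exp is not None)
    for _, key in candidates:
        if used <= limit:
            break
        used -= records[key][1]
        del records[key]
        removed.append(key)
    return removed


records: Records = {"a": (50, 40), "b": (200, 40), "c": (None, 40), "d": (300, 40)}
removed = expire_and_evict(records, now=100, capacity=200, high_water_pct=50)
```

In this run "a" is removed because it has expired, "b" is evicted to get back under the mark, and "c" survives regardless because it has no expiration.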




Operating Aerospike

divided (distributed)
 
namespace config 

The database schema is created when your application first references the sets and bins (tables and fields).

flex-schema

Update the configuration dynamically, without a restart.

Dynamic configuration takes effect immediately; no restart needed.


If you add a node to the cluster or take down a node for upgrading or servicing, 
the cluster automatically reconfigures. 
When a node fails, other nodes in the cluster rebalance the workload with minimal impact.


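When nodes are added or removed, the cluster reconfigures automatically.
When a node fails, the remaining nodes rebalance the workload with minimal impact.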




Building Applications


The Smart Client is location-aware and knows how to store and retrieve data without affecting performance.

The Smart Client is a separate thread/process that monitors cluster state to determine data location, 
which ensures that data is retrieved in a single hop.

The Smart Client allows your application to ignore the data distribution details.
