文章详情|temporal 技术要点

temporal 技术要点 所属分类 temporal 浏览量 983

根据原文整理
Temporal architecture deep dive
https://docs.temporal.io/blog/workflow-engine-principles/

Task Queues
Timers
Consistency
Scalability
Sharding and Routing
System Workflows
External Workflow and Activity Implementations

Task Queues
In practical systems, you don't want to call tasks directly,
because there can be issues with flow control, availability, or slowness.
So using queues to dispatch tasks is a very common technique.

流控 可用性 慢速度

Every practical workflow engine uses queues to dispatch tasks to workers
(working processes that host those tasks).

Timer Queue

need an external timer service or timer queue that durably stores and dispatches these timers.

Consistency
start a workflow
create state in the database
create a timer
pull tasks from the task queue for the WorkflowDefinition to pick up
when it gives a list of commands we need to update and create tasks and update the state

Workflow as unit of scalability
Every workflow should be limited in size, but we can infinitely scale out the number of workflows

partitioned
Sharding

use partitions within the database,
over-allocate the number of partitions,
allocate those partitions to specific physical hosts, and move them around if necessary.

数据库中使用分区，
分配足够多得分区数量，
将这些分区分配给特定的物理主机，并在必要时移动它们

hash workflows to a specific shard id and use consistent hashing to allocate a shard to a specific host.

Membership and Routing

routing layer

You don't want to have a fat clientside library that understands the topology of your cluster,
so you will need to have a frontend which will know the membership of the cluster and route requests accordingly.

move the queue into its a separate component with its own persistence.

Local Transfer Queues

every shard which stores workflow state also stores a queue
Every time we make an update to a shard we can also make an update to the queue because it lives in the same partition.

start a workflow

create a state for that workflow
create workflow tasks for the worker to pick up
add the task to the local queue of that shard
This will be committed to the database atomically
a thread pulls from that queue and transfers that message to the queuing subsystem.

Workflow Visibility
the ability to list workflows

Going to all 10,000 shards and asking for this information, even with indexing, would be impractical.
The way to solve this is to have a separate indexing component.

建立单独的索引

System Workflows

the History component is responsible for state transitions of individual workflows
we have Transfer Queues to be able to transactionally create tasks
a Matching component responsible for delivering tasks for queuing and matching poll for requests coming from external workers
a Front-end component because we need routing
an ElasticSearch component for indexing
a Worker component for background jobs.

Workflows and Activities to be implemented by application developers using one of the Temporal SDKs.

Multi-cluster Deployment
Shards, history, and matching will be redistributed automatically.
Clustered databases like Cassandra can sustain node failures.
Elasticsearch is fault tolerant.
Frontends are stateless so you can add and remove them anytime

If you want to provide very high availability, we have multi-cluster deployment with asynchronous replication.

高可用 多集群 异步复制

k8s核心概念

prometheus irate 函数说明

temporal cassandra 阶梯压测记录

清晨的一点点感悟

k3s安装

sh -s 用法