# Tickborg — Data Flow

## Overview

This document traces the complete path of messages through the tickborg system
for the three primary event types: **pull request**, **comment command**, and
**push event**. It then covers the statistics and log collection flows, the
message formats, and failure handling.

---

## Pull Request Flow

A PR opened against the monorepo triggers evaluation and automatic builds.

### Step-by-Step

```
GitHub                    Webhook Receiver         RabbitMQ
───────                   ─────────────────        ────────
POST /github-webhook ───► HMAC verify ──────────►  github-events exchange
  X-Hub-Signature-256       route by event type       routing_key: pull_request.opened
  X-GitHub-Event: pull_request
```
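The routing keys in these diagrams have an `{event}.{action}` shape, and the downstream queues bind with patterns such as `pull_request.*`. The sketch below illustrates that relationship. It assumes the key is built from the `X-GitHub-Event` header plus the payload's `action` field, and that events without an action repeat the event name (which would yield `push.push`). `routing_key` and `binding_matches` are hypothetical names. In the running system the matching is done by RabbitMQ itself, not by application code.

```rust
/// Build a routing key such as `pull_request.opened`.
/// Assumption: events without an `action` field (e.g. `push`) repeat the
/// event name, which matches the `push.push` key shown in this document.
fn routing_key(event: &str, action: Option<&str>) -> String {
    format!("{}.{}", event, action.unwrap_or(event))
}

/// Simplified binding match: `*` stands for exactly one dot-separated word.
/// RabbitMQ's `#` wildcard (zero or more words) is not modeled here.
fn binding_matches(pattern: &str, key: &str) -> bool {
    let mut pattern_words = pattern.split('.');
    let mut key_words = key.split('.');
    loop {
        match (pattern_words.next(), key_words.next()) {
            // Both sides exhausted at the same time: full match.
            (None, None) => return true,
            // A wildcard or an identical word: keep comparing.
            (Some(p), Some(k)) if p == "*" || p == k => continue,
            // Length mismatch or differing word.
            _ => return false,
        }
    }
}
```

With these helpers, `routing_key("pull_request", Some("opened"))` produces a key that the `pull_request.*` binding accepts and the `issue_comment.*` binding rejects.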

```
RabbitMQ                  Evaluation Filter        RabbitMQ
────────                  ─────────────────        ────────
mass-rebuild-check-inputs  PR filter logic ───────► mass-rebuild-check-jobs
  ◄── github-events          - Repo eligible?           (direct queue publish)
       pull_request.*         - Action interesting?
                              - PR open?
```
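The three filter questions in the diagram above can be read as a single predicate. The following is a minimal sketch, not the actual filter: the `PullRequestEvent` fields, the `should_evaluate` name, and the set of "interesting" actions are all assumptions made for illustration.

```rust
/// Hypothetical, trimmed-down view of a pull request webhook event.
struct PullRequestEvent {
    repo_full_name: String,
    action: String,
    state: String,
}

/// Returns true when the event should be forwarded as an evaluation job.
/// The eligible-repo list and the interesting actions are assumptions,
/// not values taken from tickborg.
fn should_evaluate(ev: &PullRequestEvent, eligible_repos: &[&str]) -> bool {
    // Repo eligible?
    let repo_eligible = eligible_repos.contains(&ev.repo_full_name.as_str());
    // Action interesting?
    let action_interesting =
        matches!(ev.action.as_str(), "opened" | "synchronize" | "reopened");
    // PR open?
    let pr_open = ev.state == "open";

    repo_eligible && action_interesting && pr_open
}
```

Events that fail any of the three checks would simply be acknowledged without publishing anything to `mass-rebuild-check-jobs`.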

```
RabbitMQ                  Mass Rebuilder            RabbitMQ / GitHub
────────                  ──────────────            ─────────────────
mass-rebuild-check-jobs    EvaluationWorker          - Commit status: pending
                           OneEval:                  - Clone + merge PR
                             1. Check PR state       - Detect changed projects
                             2. Clone repo           - Generate labels
                             3. Fetch PR             - Commit status: success
                             4. Merge                - Publish BuildJob(s)
                             5. Detect changes  ──►  build-jobs exchange (fanout)
                             6. Run eval checks
                             7. Tag PR labels   ──►  GitHub API: add labels
```

```
RabbitMQ                  Builder                   RabbitMQ / GitHub
────────                  ───────                   ─────────────────
build-inputs-{id}          BuildWorker               - Check Run: in_progress
  ◄── build-jobs            1. Clone repo            - Publish log lines ──► logs exchange
                            2. Checkout PR           - Check Run: completed
                            3. Detect build system   - Publish BuildResult ──► build-results
                            4. Build
                            5. Test (if requested)
```

```
RabbitMQ                  Comment Poster            GitHub
────────                  ──────────────            ──────
build-results              Format result    ───────► PR comment with build summary
  ◄── build-results         as markdown
```

```
RabbitMQ                  Log Collector             Disk
────────                  ─────────────             ────
build-logs                 LogMessageCollector ────► /var/log/tickborg/builds/{id}.log
  ◄── logs exchange
       logs.*
```

### Sequence Diagram

```
GitHub ──► Webhook Receiver ──► [github-events]
                                     │
                          pull_request.*
                                     ▼
                              Evaluation Filter
                                     │
                                     ▼
                         [mass-rebuild-check-jobs]
                                     │
                                     ▼
                             Mass Rebuilder ──► GitHub (status + labels)
                                     │
                              BuildJob × N
                                     ▼
                              [build-jobs]
                                     │
                                     ▼
                                Builder ──► GitHub (check run)
                               /       \
                          [logs]    [build-results]
                            │           │
                            ▼           ▼
                      Log Collector  Comment Poster ──► GitHub (PR comment)
```

---

## Comment Command Flow

A user posts `@tickbot build meshmc` on a PR.

### Step-by-Step

```
GitHub                    Webhook Receiver         RabbitMQ
───────                   ─────────────────        ────────
POST /github-webhook ───► HMAC verify ──────────►  github-events exchange
  X-GitHub-Event:            route: issue_comment     routing_key: issue_comment.created
    issue_comment
```

```
RabbitMQ                  Comment Filter           RabbitMQ
────────                  ──────────────           ────────
comment-jobs               GitHubCommentWorker      build-jobs exchange
  ◄── github-events          1. Ignore !Created
       issue_comment.*        2. Parse @tickbot
                              3. Extract instruction
                              4. ACL check
                              5. Produce BuildJob(s) ──►  build-jobs (fanout)
```

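From the `build-jobs` exchange onward, the flow is identical to the PR flow:
the Builder publishes to `logs` and `build-results`, which feed the Log
Collector and the Comment Poster respectively.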

### Comment Parser Detail

```
Input:  "@tickbot build meshmc neozip"

commentparser::parse()
    ┌──────────────────────────────────────────┐
    │  nom parser pipeline:                     │
    │  1. tag("@tickbot")                       │
    │  2. space1                                │
    │  3. alt((tag("build"), tag("test"),        │
    │         tag("eval")))                      │
    │  4. space1                                │
    │  5. separated_list1(space1, alphanumeric1) │
    └──────────────────────────────────────────┘

Output: [Instruction::Build(["meshmc", "neozip"], Subset::Project)]
```
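To make the five pipeline stages concrete without pulling in `nom`, here is a hand-rolled approximation that uses only the standard library. It is a sketch under assumptions: `Verb`, `ParsedCommand`, and `parse_command` are hypothetical names rather than the types in `commentparser`, and it treats any whitespace as a separator and rejects trailing junk, which may differ from the real parser's behaviour.

```rust
#[derive(Debug, PartialEq)]
enum Verb {
    Build,
    Test,
    Eval,
}

/// Hypothetical result type; the real parser yields `Instruction` values.
#[derive(Debug, PartialEq)]
struct ParsedCommand {
    verb: Verb,
    projects: Vec<String>,
}

fn parse_command(input: &str) -> Option<ParsedCommand> {
    // Stage 1: the mention itself.
    let rest = input.strip_prefix("@tickbot")?;
    // Stage 2: at least one whitespace character must follow the mention.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    let mut words = rest.split_whitespace();
    // Stage 3: one of the three known verbs.
    let verb = match words.next()? {
        "build" => Verb::Build,
        "test" => Verb::Test,
        "eval" => Verb::Eval,
        _ => return None,
    };
    // Stages 4 and 5: one or more whitespace-separated alphanumeric names.
    let projects: Vec<String> = words.map(String::from).collect();
    let all_alphanumeric = projects
        .iter()
        .all(|p| p.chars().all(|c| c.is_ascii_alphanumeric()));
    if projects.is_empty() || !all_alphanumeric {
        return None;
    }
    Some(ParsedCommand { verb, projects })
}
```

Parsing `"@tickbot build meshmc neozip"` with this sketch yields the `Build` verb with the two project names, mirroring the output shown above.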

### Message Expansion

A single comment can generate multiple AMQP messages:

```
@tickbot build meshmc
    │
    ▼
ACL: user allowed on [x86_64-linux, aarch64-linux, x86_64-darwin]
    │
    ▼
3 BuildJob messages:
  ├── BuildJob { project: "meshmc", system: "x86_64-linux", ... }
  ├── BuildJob { project: "meshmc", system: "aarch64-linux", ... }
  └── BuildJob { project: "meshmc", system: "x86_64-darwin", ... }
```
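The expansion is a cross product of requested projects and the systems the ACL allows for the user. A minimal sketch, assuming a reduced `BuildJob` with only the two fields shown in the diagram (the real message carries more, elided above as `...`) and a hypothetical `expand_jobs` helper:

```rust
/// Reduced, hypothetical version of the message; only the fields shown in
/// the diagram are modeled.
#[derive(Debug, Clone, PartialEq)]
struct BuildJob {
    project: String,
    system: String,
}

/// One job per (project, allowed system) pair.
fn expand_jobs(projects: &[&str], allowed_systems: &[&str]) -> Vec<BuildJob> {
    let mut jobs = Vec::new();
    for project in projects {
        for system in allowed_systems {
            jobs.push(BuildJob {
                project: project.to_string(),
                system: system.to_string(),
            });
        }
    }
    jobs
}
```

A user with no allowed systems produces an empty list, so nothing is published for that comment.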

---

## Push Event Flow

A push to a tracked branch (e.g., `main`) passes through the push filter.

### Step-by-Step

```
GitHub                    Webhook Receiver         RabbitMQ
───────                   ─────────────────        ────────
POST /github-webhook ───► HMAC verify ──────────►  github-events exchange
  X-GitHub-Event: push       route: push              routing_key: push.push
```

```
RabbitMQ                  Push Filter              RabbitMQ / External
────────                  ───────────              ─────────────────
push-jobs                  PushFilterWorker
  ◄── github-events          1. Skip tags
       push.*                 2. Skip deletes
                              3. Skip zero-SHA
                              4. Check branch name
                              5. Trigger rebuild  ──►  (future: deployment hooks)
```

### Push Event Guards

```rust
impl worker::SimpleWorker for PushFilterWorker {
    async fn consumer(&mut self, job: &ghevent::PushEvent) -> worker::Actions {
        // Skip tags
        if job.is_tag() {
            return vec![worker::Action::Ack];
        }

        // Skip branch deletions
        if job.is_delete() {
            return vec![worker::Action::Ack];
        }

        // Skip zero-SHA (orphan push)
        if job.is_zero_sha() {
            return vec![worker::Action::Ack];
        }

        // Only process main branch
        if job.branch() != Some("main") {
            return vec![worker::Action::Ack];
        }

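        // Process the push event...

        // Placeholder so the excerpt type-checks; the actions the real
        // handler returns are not shown here.
        vec![worker::Action::Ack]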
    }
}
```
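The guard helpers used above (`is_tag`, `is_delete`, `is_zero_sha`, `branch`) are not shown in this document. The sketch below is one plausible way to derive them from GitHub's push payload fields `ref`, `after`, and `deleted`. The struct layout and the `should_process` helper are assumptions for illustration, not the definitions in `ghevent`.

```rust
/// Hypothetical subset of the push payload. `git_ref` stands in for the
/// JSON field `ref`, which is a reserved word in Rust.
struct PushEvent {
    git_ref: String,
    after: String,
    deleted: bool,
}

impl PushEvent {
    fn is_tag(&self) -> bool {
        self.git_ref.starts_with("refs/tags/")
    }

    fn is_delete(&self) -> bool {
        self.deleted
    }

    /// True when the new head SHA consists only of zeros.
    fn is_zero_sha(&self) -> bool {
        !self.after.is_empty() && self.after.chars().all(|c| c == '0')
    }

    /// Branch name without the `refs/heads/` prefix, if this is a branch ref.
    fn branch(&self) -> Option<&str> {
        self.git_ref.strip_prefix("refs/heads/")
    }
}

/// Collapses the four guards into one decision.
fn should_process(job: &PushEvent) -> bool {
    !job.is_tag()
        && !job.is_delete()
        && !job.is_zero_sha()
        && job.branch() == Some("main")
}
```

Every rejected event is still acknowledged, so a filtered push never returns to the queue.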

---

## Statistics Flow

All services emit `EventMessage` events to the stats exchange.

```
Any Service
    │
    ├── worker::Action::Publish ──► [stats] exchange (fanout)
    │                                    │
    │                                    ▼
    │                              stats-events queue
    │                                    │
    │                                    ▼
    │                             StatCollectorWorker
    │                                    │
    └── Metrics:                         ▼
        - JobReceived              MetricCollector
        - JobDecodeSuccess            │
        - JobDecodeFailure            ▼
        - BuildStarted            HTTP endpoint (:9090)
        - BuildCompleted             /metrics
        - EvalStarted
        - EvalCompleted
```

### `SysEvents` Trait

```rust
// stats.rs
pub trait SysEvents: Send {
    fn notify(&mut self, event: Event)
        -> impl Future<Output = ()>;
}
```

Every worker is generic over `E: SysEvents`, allowing stats collection
to be plugged in or replaced with a no-op.
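As an illustration of that pluggability, the sketch below restates the trait with a small hypothetical `Event` enum and provides two implementations: a no-op and an in-memory recorder. `NoopEvents`, `RecordingEvents`, `report_build`, and `run_ready` are assumptions for this example. `run_ready` is only a stand-in for `.await` inside an async runtime and works because these futures complete immediately.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical subset of the metric events listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Event {
    BuildStarted,
    BuildCompleted,
}

trait SysEvents: Send {
    fn notify(&mut self, event: Event) -> impl Future<Output = ()>;
}

/// Discards every event.
struct NoopEvents;

impl SysEvents for NoopEvents {
    fn notify(&mut self, _event: Event) -> impl Future<Output = ()> {
        std::future::ready(())
    }
}

/// Keeps events in memory, e.g. for tests.
struct RecordingEvents {
    seen: Vec<Event>,
}

impl SysEvents for RecordingEvents {
    fn notify(&mut self, event: Event) -> impl Future<Output = ()> {
        // The event is recorded when `notify` is called; the returned
        // future is already complete.
        self.seen.push(event);
        std::future::ready(())
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once and panics if it is not ready yet.
fn run_ready<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => out,
        Poll::Pending => panic!("future was not immediately ready"),
    }
}

/// Code written against `E: SysEvents` does not care which sink it gets.
fn report_build<E: SysEvents>(events: &mut E) {
    run_ready(events.notify(Event::BuildStarted));
    run_ready(events.notify(Event::BuildCompleted));
}
```

Calling `report_build` with a `RecordingEvents` leaves both events in `seen`, while the same call with `NoopEvents` has no observable effect.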

---

## Log Collection Flow

Build logs are streamed in real time via the `logs` exchange.

```
Builder (BuildWorker)
    │
    │  During build execution, one start message, then one message per output line:
    │
    ├── BuildLogStart { /* ... */ }     ──► [logs] routing_key: logs.{attempt_id}
    ├── BuildLogMsg { line: "..." }     ──► [logs] routing_key: logs.{attempt_id}
    ├── BuildLogMsg { line: "..." }     ──► [logs] routing_key: logs.{attempt_id}
    └── BuildLogMsg { line: "..." }     ──► [logs] routing_key: logs.{attempt_id}
```

```
RabbitMQ                  Log Collector            Disk
────────                  ─────────────            ────
build-logs                 LogMessageCollector
  ◄── logs                   matches by attempt_id
       logs.*                writes to file:
                             {log_storage_path}/{attempt_id}.log
```

### `LogFrom` Enum

```rust
pub enum LogFrom {
    Worker(BuildLogMsg),
    Start(BuildLogStart),
}
```

The collector distinguishes between log start (creates the file with metadata
header) and log lines (appends to the file).
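The start/append distinction can be sketched with an in-memory map standing in for the log directory. Everything here beyond the `LogFrom` variant names is an assumption: `LogStore`, `attempt_id_from_key`, and the header format are hypothetical, and `BuildLogStart` is left without fields because this document elides them.

```rust
use std::collections::HashMap;

/// Fields are elided in this document, so none are modeled here.
struct BuildLogStart;

struct BuildLogMsg {
    line: String,
}

enum LogFrom {
    Worker(BuildLogMsg),
    Start(BuildLogStart),
}

/// In-memory stand-in for `{log_storage_path}`, keyed by attempt id.
struct LogStore {
    files: HashMap<String, String>,
}

impl LogStore {
    fn new() -> Self {
        LogStore { files: HashMap::new() }
    }

    fn handle(&mut self, attempt_id: &str, msg: LogFrom) {
        let file = self.files.entry(attempt_id.to_string()).or_default();
        match msg {
            // Start: (re)create the file with a metadata header.
            // The header text is a made-up placeholder.
            LogFrom::Start(_) => *file = format!("== attempt {} ==\n", attempt_id),
            // Worker: append one output line.
            LogFrom::Worker(m) => {
                file.push_str(&m.line);
                file.push('\n');
            }
        }
    }
}

/// Extracts the attempt id from a `logs.{attempt_id}` routing key.
fn attempt_id_from_key(key: &str) -> Option<&str> {
    key.strip_prefix("logs.").filter(|id| !id.is_empty())
}
```

A real collector would write to `{log_storage_path}/{attempt_id}.log` instead of a `String`, but the dispatch on `LogFrom` would look the same.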

---

## Message Format Summary

All messages are JSON-serialized via `serde_json`. Key message types and their
flows:

| Message Type | Producer | Consumer | Exchange |
|-------------|----------|----------|----------|
| `PullRequestEvent` | Webhook Receiver | Evaluation Filter | `github-events` |
| `IssueComment` | Webhook Receiver | Comment Filter | `github-events` |
| `PushEvent` | Webhook Receiver | Push Filter | `github-events` |
| `EvaluationJob` | Evaluation Filter / Comment Filter | Mass Rebuilder | _(direct queue)_ |
| `BuildJob` | Mass Rebuilder / Comment Filter | Builder | `build-jobs` |
| `BuildResult` | Builder | Comment Poster, Stats | `build-results` |
| `BuildLogStart`, `BuildLogMsg` | Builder | Log Collector | `logs` |
| `EventMessage` | Any service | Stats Collector | `stats` |

---

## Failure Modes and Recovery

### Transient Failures

| Failure | Recovery Mechanism |
|---------|-------------------|
| GitHub API 401 (expired token) | `NackRequeue` → retry after token refresh |
| GitHub API 5xx | `NackRequeue` → retry |
| RabbitMQ connection lost | `lapin` reconnect / systemd restart |
| Build timeout | `BuildStatus::TimedOut` → report to GitHub |

### Permanent Failures

| Failure | Handling |
|---------|----------|
| Invalid message JSON | `Ack` (discard) + log error |
| PR force-pushed (SHA gone) | `Ack` (skip) — `MissingSha` |
| GitHub API 4xx (not 401/422) | `Ack` + add `tickborg-internal-error` label |
| Merge conflict | Report failure status to GitHub, `Ack` |

### Dead Letter Behavior

Messages rejected with `NackDump` (nack without requeue) are discarded unless a
dead-letter exchange is configured in RabbitMQ. `NackDump` is intended for
permanently invalid messages that should not be retried.
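The rows of the failure tables that resolve to a message action can be condensed into a single decision function. This is a sketch under assumptions: the `Failure` enum and `action_for` are hypothetical, the local `Action` enum only mirrors the action names used in this document, and cases the tables do not specify (such as HTTP 422) return `None` rather than guessing.

```rust
#[derive(Debug, PartialEq)]
enum Action {
    Ack,
    NackRequeue,
    /// Reject without requeue; no row in the tables selects it.
    NackDump,
}

/// Hypothetical classification of the failures listed in the tables.
#[derive(Debug)]
enum Failure {
    GitHubStatus(u16),
    InvalidJson,
    MissingSha,
    MergeConflict,
}

/// Returns the action the tables prescribe, or `None` where they are silent.
fn action_for(failure: &Failure) -> Option<Action> {
    match failure {
        // Transient: expired token or server-side error, retry later.
        Failure::GitHubStatus(401) | Failure::GitHubStatus(500..=599) => {
            Some(Action::NackRequeue)
        }
        // Explicitly excluded from the generic 4xx row; handling not documented.
        Failure::GitHubStatus(422) => None,
        // Other client errors: acknowledge (and label the PR, not modeled here).
        Failure::GitHubStatus(400..=499) => Some(Action::Ack),
        Failure::GitHubStatus(_) => None,
        // Permanent: retrying cannot help, so acknowledge and move on.
        Failure::InvalidJson | Failure::MissingSha | Failure::MergeConflict => {
            Some(Action::Ack)
        }
    }
}
```

Build timeouts and lost RabbitMQ connections are left out because the tables resolve them through a status report or a reconnect rather than a message action.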