# cmark — Render Framework
## Overview
The render framework (`render.c`, `render.h`) provides a generic rendering infrastructure used by three of the five renderers: LaTeX, man, and CommonMark. It handles line wrapping, prefix management, and character-level output dispatch. The HTML and XML renderers bypass this framework and write directly to buffers.
## The `cmark_renderer` Structure
```c
struct cmark_renderer {
  cmark_mem *mem;
  cmark_strbuf *buffer;      // Output buffer
  cmark_strbuf *prefix;      // Current line prefix (e.g., "> " for blockquotes)
  int column;                // Current column position (for wrapping)
  int width;                 // Target width (0 = no wrapping)
  int need_cr;               // Pending newlines count
  bufsize_t last_breakable;  // Position of last breakable point in buffer
  bool begin_line;           // True if at the start of a line
  bool begin_content;        // True if no content has been output on current line (after prefix)
  bool no_linebreaks;        // Suppress newlines (for rendering within attributes)
  bool in_tight_list_item;   // Currently inside a tight list item
  // Per-character output callback
  void (*outc)(cmark_renderer *, cmark_escaping, int32_t, unsigned char);
  // Per-node render callback
  int32_t (*render_node)(cmark_renderer *, cmark_node *, cmark_event_type, int);
};
```
### Key Fields
- **`column`** — Tracks horizontal position for word-wrap decisions.
- **`width`** — If > 0, enables automatic line wrapping at word boundaries.
- **`prefix`** — Accumulated prefix string. For nested block quotes and list items, prefixes stack (e.g., `"> - "` for a list item inside a block quote).
- **`last_breakable`** — Buffer position of the last whitespace where a line break could be inserted. Used for retroactive line wrapping.
- **`begin_line`** — True immediately after a newline. Used by renderers to decide whether to escape line-start characters.
- **`begin_content`** — True until the first non-prefix content on a line. Distinguished from `begin_line` because the prefix itself isn't "content".
- **`no_linebreaks`** — When true, newlines are converted to spaces. Used when rendering content inside constructs that can't contain literal newlines.
## Entry Point
```c
char *cmark_render(cmark_mem *mem, cmark_node *root, int options, int width,
                   void (*outc)(cmark_renderer *, cmark_escaping, int32_t,
                                unsigned char),
                   int32_t (*render_node)(cmark_renderer *, cmark_node *,
                                          cmark_event_type, int)) {
  cmark_strbuf pref = CMARK_BUF_INIT(mem); // Backing storage for the prefix
  cmark_strbuf buf = CMARK_BUF_INIT(mem);  // Backing storage for the output
  cmark_renderer renderer = {
      mem,
      &buf,       // buffer
      &pref,      // prefix
      0,          // column
      width,      // width
      0,          // need_cr
      0,          // last_breakable
      true,       // begin_line
      true,       // begin_content
      false,      // no_linebreaks
      false,      // in_tight_list_item
      outc,       // outc
      render_node // render_node
  };
  // ... iterate AST, call render_node for each event
  return (char *)cmark_strbuf_detach(&buf);
}
```
The framework creates a `cmark_renderer`, iterates over the AST using `cmark_iter`, and calls the provided `render_node` function for each event. The `outc` callback handles per-character output with escaping decisions.
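The division of labor between the framework and a renderer can be illustrated with a small standalone sketch. Everything in it (`toy_renderer`, `toy_node`, `toy_event`, `toy_render`, `upper_outc`, `para_render_node`) is hypothetical and much simpler than cmark's real types: a flat array of nodes stands in for the AST iterator, and the callbacks take fewer arguments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the framework; not cmark's actual API. */
typedef struct toy_renderer toy_renderer;
typedef enum { TOY_ENTER, TOY_EXIT } toy_event;
typedef struct { const char *text; } toy_node;

struct toy_renderer {
    char buf[128];   /* output buffer, kept NUL-terminated */
    size_t size;     /* bytes used in buf */
    void (*outc)(toy_renderer *, char);
    int (*render_node)(toy_renderer *, const toy_node *, toy_event);
};

/* Framework side: initialise state, then fire ENTER/EXIT for each node. */
static void toy_render(toy_renderer *r, const toy_node *nodes, size_t n,
                       void (*outc)(toy_renderer *, char),
                       int (*render_node)(toy_renderer *, const toy_node *,
                                          toy_event)) {
    r->size = 0;
    r->buf[0] = '\0';
    r->outc = outc;
    r->render_node = render_node;
    for (size_t i = 0; i < n; i++) {
        r->render_node(r, &nodes[i], TOY_ENTER);
        r->render_node(r, &nodes[i], TOY_EXIT);
    }
}

/* Renderer side: a per-character callback that upper-cases ASCII letters. */
static void upper_outc(toy_renderer *r, char c) {
    if (r->size + 1 < sizeof r->buf) {
        r->buf[r->size++] = (c >= 'a' && c <= 'z') ? (char)(c - 'a' + 'A') : c;
        r->buf[r->size] = '\0';
    }
}

/* Renderer side: emit the node text on ENTER and a newline on EXIT. */
static int para_render_node(toy_renderer *r, const toy_node *node,
                            toy_event ev) {
    if (ev == TOY_ENTER) {
        for (const char *p = node->text; *p; p++)
            r->outc(r, *p);
    } else {
        r->outc(r, '\n');
    }
    return 1;
}
```

Calling `toy_render(&r, nodes, 2, upper_outc, para_render_node)` on two nodes holding `"hi"` and `"there"` leaves `"HI\nTHERE\n"` in `r.buf`. The framework never inspects node content itself; it only owns the buffer and the traversal.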
## Escaping Modes
```c
typedef enum {
  LITERAL, // No escaping: output characters as-is
  NORMAL,  // Full escaping for prose text
  TITLE,   // Escaping for link titles
  URL,     // Escaping for URLs
} cmark_escaping;
```
Each renderer's `outc` function switches on this enum to determine how to handle special characters.
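As a rough illustration of that switch, the sketch below handles one ASCII character per call. The names `toy_escaping` and `toy_outc`, the fixed-size output buffer, and the particular escape rules (backslash before `&`, `%`, `#`; `%20` for a space in URLs) are assumptions made for the example, not the rules of any actual cmark renderer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for cmark_escaping; mirrors the enum above. */
typedef enum { LITERAL, NORMAL, TITLE, URL } toy_escaping;

/* Append the rendering of one ASCII character `c` to the NUL-terminated
 * string `out` (capacity `cap`). Returns the number of bytes appended,
 * or 0 if there was not enough room. */
static size_t toy_outc(char *out, size_t cap, toy_escaping escape, char c) {
    char tmp[8];
    size_t used = strlen(out);

    switch (escape) {
    case LITERAL:
        /* Pass the character through untouched. */
        snprintf(tmp, sizeof tmp, "%c", c);
        break;
    case URL:
        /* Percent-encode spaces only; a real renderer covers more. */
        if (c == ' ')
            snprintf(tmp, sizeof tmp, "%%20");
        else
            snprintf(tmp, sizeof tmp, "%c", c);
        break;
    case NORMAL:
    case TITLE:
    default:
        /* Backslash-escape a few characters (illustrative choice). */
        if (c == '&' || c == '%' || c == '#')
            snprintf(tmp, sizeof tmp, "\\%c", c);
        else
            snprintf(tmp, sizeof tmp, "%c", c);
        break;
    }

    size_t n = strlen(tmp);
    if (used + n + 1 > cap)
        return 0;
    memcpy(out + used, tmp, n + 1); /* copies the terminating NUL too */
    return n;
}
```

The same input character produces different output depending on the mode: `&` stays `&` under `LITERAL` but becomes `\&` under `NORMAL` in this sketch.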
## Output Functions
### `cmark_render_code_point()`
```c
void cmark_render_code_point(cmark_renderer *renderer, int32_t c) {
  cmark_utf8proc_encode_char(c, renderer->buffer);
  renderer->column += 1;
}
```
Low-level: encodes a single Unicode codepoint as UTF-8 into the buffer and advances the column counter.
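One consequence worth spelling out is that the column counter advances by one per code point, not per byte. The sketch below shows this with a hand-written UTF-8 encoder; `toy_cp_renderer` and `toy_render_code_point` are hypothetical names, and the encoder assumes a valid Unicode scalar value and enough room in the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mini-renderer: just a byte buffer and a column counter. */
typedef struct {
    unsigned char buf[64];
    size_t size;
    int column;
} toy_cp_renderer;

/* Encode code point `c` as UTF-8, append it, and advance column by one.
 * Assumes `c` is a valid scalar value and the buffer has room. */
static void toy_render_code_point(toy_cp_renderer *r, int32_t c) {
    if (c < 0x80) {
        r->buf[r->size++] = (unsigned char)c;
    } else if (c < 0x800) {
        r->buf[r->size++] = (unsigned char)(0xC0 | (c >> 6));
        r->buf[r->size++] = (unsigned char)(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        r->buf[r->size++] = (unsigned char)(0xE0 | (c >> 12));
        r->buf[r->size++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        r->buf[r->size++] = (unsigned char)(0x80 | (c & 0x3F));
    } else {
        r->buf[r->size++] = (unsigned char)(0xF0 | (c >> 18));
        r->buf[r->size++] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        r->buf[r->size++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        r->buf[r->size++] = (unsigned char)(0x80 | (c & 0x3F));
    }
    r->column += 1; /* one column per code point, regardless of byte length */
}
```

Rendering `'a'`, U+00E9 and U+20AC in sequence writes 1 + 2 + 3 = 6 bytes while the column only reaches 3, which is why wrap decisions based on `column` do not depend on how many bytes each character occupies.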
### `cmark_render_ascii()`
```c
void cmark_render_ascii(cmark_renderer *renderer, const char *s) {
  int len = (int)strlen(s);
  cmark_strbuf_puts(renderer->buffer, s);
  renderer->column += len;
}
```
Outputs an ASCII string and advances the column counter. Used for fixed escape sequences like `\&`, `\textbf{`, etc.
### `S_out()` — Main Output Dispatcher
```c
static CMARK_INLINE void S_out(cmark_renderer *renderer, const char *source,
                               bool wrap, cmark_escaping escape) {
  int length = (int)strlen(source);
  unsigned char nextc;
  int32_t c;
  int i = 0;
  int len;
  cmark_chunk remainder = cmark_chunk_literal("");
  int k = renderer->buffer->size - 1;

  wrap = wrap && !renderer->no_linebreaks;

  if (renderer->need_cr) {
    // Output pending newlines
    while (renderer->need_cr > 0) {
      S_cr(renderer);
      renderer->need_cr--;
    }
  }

  while (i < length) {
    if (renderer->begin_line) {
      // Output prefix at start of each line
      cmark_strbuf_puts(renderer->buffer, (char *)renderer->prefix->ptr);
      renderer->column = renderer->prefix->size;
      renderer->begin_line = false;
      renderer->begin_content = true;
    }

    len = cmark_utf8proc_charlen((uint8_t *)source + i, length - i);
    if (len == -1) { // Invalid UTF-8
      // ... handle error
    }
    cmark_utf8proc_iterate((uint8_t *)source + i, len, &c);
    // Byte following this code point (0 at end of string); passed to outc
    nextc = source[i + len];

    if (c == 10) {
      // Newline
      cmark_strbuf_putc(renderer->buffer, '\n');
      renderer->column = 0;
      renderer->begin_line = true;
      renderer->begin_content = true;
      renderer->last_breakable = 0;
    } else if (wrap) {
      if (c == 32 && renderer->column > renderer->width / 2) {
        // Space past half-width: mark as potential break point
        renderer->last_breakable = renderer->buffer->size;
        cmark_render_code_point(renderer, c);
      } else if (renderer->column > renderer->width &&
                 renderer->last_breakable > 0) {
        // Past target width with a break point: retroactively break.
        // Replace the space at last_breakable with newline + prefix
        // ...
      } else {
        renderer->outc(renderer, escape, c, nextc);
      }
    } else {
      renderer->outc(renderer, escape, c, nextc);
    }

    if (c != 10) {
      renderer->begin_content = false;
    }
    i += len;
  }
}
```
This is the core output function. It:
1. Handles deferred newlines (`need_cr`)
2. Outputs line prefixes at the start of each line
3. Tracks column position
4. Implements word wrapping via retroactive line breaks
5. Delegates character-level escaping to `renderer->outc()`
### Line Wrapping Algorithm
The wrapping algorithm uses a **retroactive break** strategy:
1. As text flows through `S_out()`, spaces past the half-width mark are recorded as potential break points (`last_breakable`).
2. When the column exceeds `width`, the buffer is split at `last_breakable`:
   - Everything after the break point is saved in `remainder`
   - A newline and the current prefix are inserted at the break point
   - The remainder is re-appended
This avoids lookahead: the renderer doesn't need to know the length of upcoming content to decide where to break.
```c
// Retroactive line break:
remainder = cmark_chunk_dup(&renderer->buffer->..., last_breakable, ...);
cmark_strbuf_truncate(renderer->buffer, last_breakable);
cmark_strbuf_putc(renderer->buffer, '\n');
cmark_strbuf_puts(renderer->buffer, (char *)renderer->prefix->ptr);
cmark_strbuf_put(renderer->buffer, remainder.data, remainder.len);
renderer->column = renderer->prefix->size + cmark_chunk_len(&remainder);
renderer->last_breakable = 0;
renderer->begin_line = false;
renderer->begin_content = false;
```
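The strategy can be tried in isolation with a small standalone sketch. The type `toy_wrapper` and the functions `toy_put`, `toy_wrap_init`, and `toy_wrap_out` are hypothetical, use fixed-size buffers, handle ASCII only, and assume `width > 0`; they follow the steps described above rather than reproducing `render.c`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper state; field names echo cmark_renderer. */
typedef struct {
    char buf[256];         /* output buffer, kept NUL-terminated */
    size_t size;           /* bytes used in buf */
    const char *prefix;    /* line prefix, assumed ASCII */
    int column;
    int width;             /* assumed > 0 in this sketch */
    size_t last_breakable; /* index of a space in buf, 0 = none */
} toy_wrapper;

static void toy_put(toy_wrapper *w, const char *s, size_t n) {
    if (w->size + n >= sizeof w->buf)
        return; /* silently drop on overflow */
    memcpy(w->buf + w->size, s, n);
    w->size += n;
    w->buf[w->size] = '\0';
}

static void toy_wrap_init(toy_wrapper *w, const char *prefix, int width) {
    w->size = 0;
    w->buf[0] = '\0';
    w->prefix = prefix;
    w->width = width;
    w->last_breakable = 0;
    toy_put(w, prefix, strlen(prefix)); /* first line gets the prefix too */
    w->column = (int)strlen(prefix);
}

static void toy_wrap_out(toy_wrapper *w, const char *text) {
    for (size_t i = 0; text[i] != '\0'; i++) {
        char c = text[i];
        if (c == ' ' && w->column > w->width / 2) {
            /* Remember where this space is about to land. */
            w->last_breakable = w->size;
        } else if (w->column > w->width && w->last_breakable > 0) {
            /* Retroactive break: save the text after the space, cut the
             * buffer at the space, then add newline + prefix + saved text. */
            char remainder[256];
            size_t rem_len = w->size - (w->last_breakable + 1);
            memcpy(remainder, w->buf + w->last_breakable + 1, rem_len);
            w->size = w->last_breakable;
            w->buf[w->size] = '\0';
            toy_put(w, "\n", 1);
            toy_put(w, w->prefix, strlen(w->prefix));
            toy_put(w, remainder, rem_len);
            w->column = (int)(strlen(w->prefix) + rem_len);
            w->last_breakable = 0;
        }
        toy_put(w, &c, 1);
        w->column += 1;
    }
}
```

With prefix `"> "` and width 10, feeding `"aaa bbb ccc ddd"` yields two lines, `> aaa bbb` and `> ccc ddd`. The first `c` is written onto the first line and only moved down once the second `c` pushes the column past the width.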
## Convenience Functions
### `CR()`
```c
#define CR() renderer->need_cr = 1
```
Requests a newline before the next content output. Multiple `CR()` calls don't stack — only one newline is inserted.
### `BLANKLINE()`
```c
#define BLANKLINE() renderer->need_cr = 2
```
Requests a blank line (two newlines) before the next content output.
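The deferred behavior of these two macros can be sketched with a standalone miniature. `toy_out_state`, `TOY_CR`, `TOY_BLANKLINE`, and `toy_lit` are hypothetical names; the sketch assumes pending newlines are flushed at the start of the next output call and ignores prefixes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical output state: a buffer plus the count of owed newlines. */
typedef struct {
    char buf[128];
    size_t size;
    int need_cr; /* newlines owed before the next content */
} toy_out_state;

/* Both macros assign rather than increment, so requests do not stack. */
#define TOY_CR(r)        ((r)->need_cr = 1)
#define TOY_BLANKLINE(r) ((r)->need_cr = 2)

static void toy_lit(toy_out_state *r, const char *s) {
    /* Flush owed newlines first, then append the text itself. */
    while (r->need_cr > 0 && r->size + 1 < sizeof r->buf) {
        r->buf[r->size++] = '\n';
        r->need_cr--;
    }
    size_t n = strlen(s);
    if (r->size + n < sizeof r->buf) {
        memcpy(r->buf + r->size, s, n);
        r->size += n;
    }
    r->buf[r->size] = '\0';
}
```

In this sketch, writing `"a"`, requesting `TOY_CR` twice, writing `"b"`, requesting `TOY_BLANKLINE`, and writing `"c"` produces `"a\nb\n\nc"`. A request made after the last piece of content is never flushed, so the output carries no trailing newline unless something is written afterwards.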
### `OUT()`
```c
#define OUT(s, wrap, escaping) (S_out(renderer, s, wrap, escaping))
```
General form: passes `s` to `S_out()` with the caller-supplied wrap flag and escaping mode.
### `LIT()`
```c
#define LIT(s) (S_out(renderer, s, false, LITERAL))
```
Outputs literal text (no escaping, no wrapping).
### `NOBREAKS()`
```c
#define NOBREAKS(s) \
do { renderer->no_linebreaks = true; OUT(s, false, NORMAL); renderer->no_linebreaks = false; } while(0)
```
Outputs text with normal escaping but with newlines suppressed (converted to spaces).
## Prefix Management
Prefixes are used for block-level indentation. The renderer maintains a `cmark_strbuf` prefix that is output at the start of each line.
### Usage Pattern
```c
// In commonmark.c, entering a block quote:
cmark_strbuf_puts(renderer->prefix, "> ");
// ... render children ...
// On exit:
cmark_strbuf_truncate(renderer->prefix, original_prefix_len);
```
Renderers save the prefix length before modifying it and restore it on exit. This creates a stack-like behavior for nested containers.
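The save-and-truncate pattern can be exercised on its own with a minimal sketch. `toy_prefix`, `toy_prefix_push`, and `toy_prefix_truncate` are hypothetical stand-ins for a `cmark_strbuf` and its `puts` and `truncate` operations, backed here by a fixed-size array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix buffer standing in for renderer->prefix. */
typedef struct {
    char prefix[64];   /* kept NUL-terminated */
    size_t prefix_len; /* current length in bytes */
} toy_prefix;

/* Append `s` to the prefix (entering a container). */
static void toy_prefix_push(toy_prefix *p, const char *s) {
    size_t n = strlen(s);
    if (p->prefix_len + n < sizeof p->prefix) {
        memcpy(p->prefix + p->prefix_len, s, n + 1); /* includes the NUL */
        p->prefix_len += n;
    }
}

/* Cut the prefix back to a previously saved length (leaving a container). */
static void toy_prefix_truncate(toy_prefix *p, size_t len) {
    if (len <= p->prefix_len) {
        p->prefix_len = len;
        p->prefix[len] = '\0';
    }
}
```

A caller saves `p.prefix_len` before each push and passes that value to `toy_prefix_truncate` on exit. Pushing `"> "` and then `"  "` gives `">   "`; truncating to the two saved lengths in reverse order restores `"> "` and then the empty prefix, so nested containers unwind like a stack without storing the pushed strings.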
## Framework vs Direct Rendering
| Feature | Framework (render.c) | Direct (html.c, xml.c) |
|---------|---------------------|----------------------|
| Line wrapping | Yes (`width` parameter) | No |
| Prefix management | Yes (automatic) | No (uses HTML tags) |
| Per-char escaping | Via `outc` callback | Via `escape_html()` helper |
| Column tracking | Yes | No |
| Break points | Retroactive insertion | N/A |
| `cmark_escaping` enum | Yes | No |
## Which Renderers Use the Framework
| Renderer | Uses Framework | Why/Why Not |
|----------|---------------|-------------|
| LaTeX (`latex.c`) | Yes | Needs wrapping for structured text |
| man (`man.c`) | Yes | Needs wrapping for terminal display |
| CommonMark (`commonmark.c`) | Yes | Needs wrapping and prefix management |
| HTML (`html.c`) | No | Layout is handled by the browser, so no wrapping is needed |
| XML (`xml.c`) | No | XML output is structural, not visual |
## Cross-References
- [render.c](../../cmark/src/render.c) — Framework implementation
- [render.h](../../cmark/src/render.h) — `cmark_renderer` struct and `cmark_escaping` enum
- [latex-renderer.md](latex-renderer.md) — LaTeX `outc` and `S_render_node`
- [man-renderer.md](man-renderer.md) — Man `S_outc` and `S_render_node`
- [commonmark-renderer.md](commonmark-renderer.md) — CommonMark `outc` and `S_render_node`
- [html-renderer.md](html-renderer.md) — Direct renderer (no framework)