| author | Mehmet Samet Duman <yongdohyun@projecttick.org> | 2026-04-04 12:41:27 +0300 |
|---|---|---|
| committer | Mehmet Samet Duman <yongdohyun@projecttick.org> | 2026-04-04 12:41:27 +0300 |
| commit | 4f2d36194b4f299aa7509d815c07121039ea833b (patch) | |
| tree | f3ded014bad3a4c76ff6a22b8726ebaab68c3d13 /uvim/runtime/doc/pattern.txt | |
| parent | 5b578e70c314723a3cde5c9bfc2be0bf1dadc93b (diff) | |
NOISSUE change uvim folder name to mnv
Signed-off-by: Mehmet Samet Duman <yongdohyun@projecttick.org>
Diffstat (limited to 'uvim/runtime/doc/pattern.txt')
| -rw-r--r-- | uvim/runtime/doc/pattern.txt | 1560 |
1 files changed, 0 insertions, 1560 deletions
diff --git a/uvim/runtime/doc/pattern.txt b/uvim/runtime/doc/pattern.txt deleted file mode 100644 index 0b6dfc9050..0000000000 --- a/uvim/runtime/doc/pattern.txt +++ /dev/null @@ -1,1560 +0,0 @@ -*pattern.txt* For MNV version 10.0. Last change: 2026 Feb 14 - - - MNV REFERENCE MANUAL by Bram Moolenaar - - -Patterns and search commands *pattern-searches* - -The very basics can be found in section |03.9| of the user manual. A few more -explanations are in chapter 27 |usr_27.txt|. - -1. Search commands |search-commands| -2. The definition of a pattern |search-pattern| -3. Magic |/magic| -4. Overview of pattern items |pattern-overview| -5. Multi items |pattern-multi-items| -6. Ordinary atoms |pattern-atoms| -7. Ignoring case in a pattern |/ignorecase| -8. Composing characters |patterns-composing| -9. Compare with Perl patterns |perl-patterns| -10. Highlighting matches |match-highlight| -11. Fuzzy matching |fuzzy-matching| - -============================================================================== -1. Search commands *search-commands* - - */* -/{pattern}[/]<CR> Search forward for the [count]'th occurrence of - {pattern} |exclusive|. - -/{pattern}/{offset}<CR> Search forward for the [count]'th occurrence of - {pattern} and go |{offset}| lines up or down. - |linewise|. - - */<CR>* -/<CR> Search forward for the [count]'th occurrence of the - latest used pattern |last-pattern| with latest used - |{offset}|. - -//{offset}<CR> Search forward for the [count]'th occurrence of the - latest used pattern |last-pattern| with new - |{offset}|. If {offset} is empty no offset is used. - - *?* -?{pattern}[?]<CR> Search backward for the [count]'th previous - occurrence of {pattern} |exclusive|. - -?{pattern}?{offset}<CR> Search backward for the [count]'th previous - occurrence of {pattern} and go |{offset}| lines up or - down |linewise|. - - *?<CR>* -?<CR> Search backward for the [count]'th occurrence of the - latest used pattern |last-pattern| with latest used - |{offset}|. 
- -??{offset}<CR> Search backward for the [count]'th occurrence of the - latest used pattern |last-pattern| with new - |{offset}|. If {offset} is empty no offset is used. - - *n* -n Repeat the latest "/" or "?" [count] times. - If the cursor doesn't move the search is repeated with - count + 1. - |last-pattern| - - *N* -N Repeat the latest "/" or "?" [count] times in - opposite direction. |last-pattern| - - *star* *E348* *E349* -* Search forward for the [count]'th occurrence of the - word nearest to the cursor. The word used for the - search is the first of: - 1. the keyword under the cursor 'iskeyword' - 2. the first keyword after the cursor, in the - current line - 3. the non-blank word under the cursor - 4. the first non-blank word after the cursor, - in the current line - Only whole keywords are searched for, like with the - command "/\<keyword\>". |exclusive| - 'ignorecase' is used, 'smartcase' is not. - - *#* -# Same as "*", but search backward. The pound sign - (character 163) also works. If the "#" key works as - backspace, try using "stty erase <BS>" before starting - MNV (<BS> is CTRL-H or a real backspace). - - *gstar* -g* Like "*", but don't put "\<" and "\>" around the word. - This makes the search also find matches that are not a - whole word. - - *g#* -g# Like "#", but don't put "\<" and "\>" around the word. - This makes the search also find matches that are not a - whole word. - - *gd* -gd Goto local Declaration. When the cursor is on a local - variable, this command will jump to its declaration. - This was made to work for C code, in other languages - it may not work well. - First MNV searches for the start of the current - function, just like "[[". If it is not found the - search stops in line 1. If it is found, MNV goes back - until a blank line is found. From this position MNV - searches for the keyword under the cursor, like with - "*", but lines that look like a comment are ignored - (see 'comments' option). 
- Note that this is not guaranteed to work, MNV does not - really check the syntax, it only searches for a match - with the keyword. If included files also need to be - searched use the commands listed in |include-search|. - After this command |n| searches forward for the next - match (not backward). - - *gD* -gD Goto global Declaration. When the cursor is on a - global variable that is defined in the file, this - command will jump to its declaration. This works just - like "gd", except that the search for the keyword - always starts in line 1. - - *1gd* -1gd Like "gd", but ignore matches inside a {} block that - ends before the cursor position. - - *1gD* -1gD Like "gD", but ignore matches inside a {} block that - ends before the cursor position. - - *CTRL-C* -CTRL-C Interrupt current (search) command. Use CTRL-Break on - MS-Windows |dos-CTRL-Break|. - In Normal mode, any pending command is aborted. - When MNV was started with output redirected and there - are no changed buffers CTRL-C exits MNV. That is to - help users who use "mnv file | grep word" and don't - know how to get out (blindly typing :qa<CR> would - work). - If a popup with a |popup-filter| is open, the popup - will be closed. - - *:noh* *:nohlsearch* -:noh[lsearch] Stop the highlighting for the 'hlsearch' option. It - is automatically turned back on when using a search - command, or setting the 'hlsearch' option. - This command doesn't work in an autocommand, because - the highlighting state is saved and restored when - executing autocommands |autocmd-searchpat|. - Same thing for when invoking a user function. - - -While typing the search pattern the current match will be shown if the -'incsearch' option is on. Remember that you still have to finish the search -command with <CR> to actually position the cursor at the displayed match. Or -use <Esc> to abandon the search. - - *nohlsearch-auto* -All matches for the last used search pattern will be highlighted if you set -the 'hlsearch' option. 
This can be suspended with the |:nohlsearch| command -or auto suspended with nohlsearch plugin. See |nohlsearch-install|. - - -When 'shortmess' does not include the "S" flag, MNV will automatically show an -index, on which the cursor is. This can look like this: > - - [1/5] Cursor is on first of 5 matches. - [1/>99] Cursor is on first of more than 99 matches. - [>99/>99] Cursor is after 99 match of more than 99 matches. - [?/??] Unknown how many matches exists, generating the - statistics was aborted because of search timeout. - -Note: the count does not take offset into account. - -When no match is found you get the error: *E486* Pattern not found -Note that for the `:global` command, when used in legacy script, you get a -normal message "Pattern not found", for Vi compatibility. -In |MNV9| script you get E486 for "pattern not found" or *E538* when the pattern -matches in every line with `:vglobal`. -For the |:s| command the "e" flag can be used to avoid the error message -|:s_flags|. - - *search-options* -The following options affect how a search is performed in MNV: - 'hlsearch' highlight matches - 'ignorecase' ignore case when searching - 'imsearch' use |IME| when entering the search pattern - 'incsearch' show matches incrementally as the pattern is typed - 'maxsearchcount' maximum number for the search count |shm-S| - 'shortmess' suppress messages |shm-s|; show search count |shm-S| - 'smartcase' override 'ignorecase' if pattern contains uppercase - 'wrapscan' continue searching from the start of the file - - *search-offset* *{offset}* -These commands search for the specified pattern. With "/" and "?" an -additional offset may be given. There are two types of offsets: line offsets -and character offsets. 
- -The offset gives the cursor position relative to the found match: - [num] [num] lines downwards, in column 1 - +[num] [num] lines downwards, in column 1 - -[num] [num] lines upwards, in column 1 - e[+num] [num] characters to the right of the end of the match - e[-num] [num] characters to the left of the end of the match - s[+num] [num] characters to the right of the start of the match - s[-num] [num] characters to the left of the start of the match - b[+num] [num] identical to s[+num] above (mnemonic: begin) - b[-num] [num] identical to s[-num] above (mnemonic: begin) - ;{pattern} perform another search, see |//;| - -If a '-' or '+' is given but [num] is omitted, a count of one will be used. -When including an offset with 'e', the search becomes inclusive (the -character the cursor lands on is included in operations). - -Examples: - -pattern cursor position ~ -/test/+1 one line below "test", in column 1 -/test/e on the last t of "test" -/test/s+2 on the 's' of "test" -/test/b-3 three characters before "test" - -If one of these commands is used after an operator, the characters between -the cursor position before and after the search is affected. However, if a -line offset is given, the whole lines between the two cursor positions are -affected. - -An example of how to search for matches with a pattern and change the match -with another word: > - /foo<CR> find "foo" - c//e<CR> change until end of match - bar<Esc> type replacement - //<CR> go to start of next match - c//e<CR> change until end of match - beep<Esc> type another replacement - etc. -< - *//;* *E386* -A very special offset is ';' followed by another search command. For example: > - - /test 1/;/test - /test.*/+1;?ing? - -The first one first finds the next occurrence of "test 1", and then the first -occurrence of "test" after that. - -This is like executing two search commands after each other, except that: -- It can be used as a single motion command after an operator. 
-- The direction for a following "n" or "N" command comes from the first - search command. -- When an error occurs the cursor is not moved at all. - - *last-pattern* -The last used pattern and offset are remembered. They can be used to repeat -the search, possibly in another direction or with another count. Note that -two patterns are remembered: One for "normal" search commands and one for the -substitute command ":s". Each time an empty pattern is given, the previously -used pattern is used. However, if there is no previous search command, a -previous substitute pattern is used, if possible. - -The 'magic' option sticks with the last used pattern. If you change 'magic', -this will not change how the last used pattern will be interpreted. -The 'ignorecase' option does not do this. When 'ignorecase' is changed, it -will result in the pattern to match other text. - -All matches for the last used search pattern will be highlighted if you set -the 'hlsearch' option. - -To clear the last used search pattern: > - :let @/ = "" -This will not set the pattern to an empty string, because that would match -everywhere. The pattern is really cleared, like when starting MNV. - -The search usually skips matches that don't move the cursor. Whether the next -match is found at the next character or after the skipped match depends on the -'c' flag in 'cpoptions'. See |cpo-c|. - with 'c' flag: "/..." advances 1 to 3 characters - without 'c' flag: "/..." advances 1 character -The unpredictability with the 'c' flag is caused by starting the search in the -first column, skipping matches until one is found past the cursor position. - -When searching backwards, searching starts at the start of the line, using the -'c' flag in 'cpoptions' as described above. Then the last match before the -cursor position is used. - -In Vi the ":tag" command sets the last search pattern when the tag is searched -for. 
In MNV this is not done, the previous search pattern is still -remembered, unless the 't' flag is present in 'cpoptions'. The search pattern -is always put in the search history. - -If the 'wrapscan' option is on (which is the default), searches wrap around -the end of the buffer. If 'wrapscan' is not set, the backward search stops -at the beginning and the forward search stops at the end of the buffer. If -'wrapscan' is set and the pattern was not found the error message "pattern -not found" is given, and the cursor will not be moved. If 'wrapscan' is not -set the message becomes "search hit BOTTOM without match" when searching -forward, or "search hit TOP without match" when searching backward. If -wrapscan is set and the search wraps around the end of the file the message -"search hit TOP, continuing at BOTTOM" or "search hit BOTTOM, continuing at -TOP" is given when searching backwards or forwards respectively. This can be -switched off by setting the 's' flag in the 'shortmess' option. The highlight -method 'w' is used for this message (default: standout). - - *search-range* -You can limit the search command "/" to a certain range of lines by including -\%>l items. For example, to match the word "limit" below line 199 and above -line 300: > - /\%>199l\%<300llimit -Also see |/\%>l|. - -Another way is to use the ":substitute" command with the 'c' flag. Example: > - :.,300s/Pattern//gc -This command will search from the cursor position until line 300 for -"Pattern". At the match, you will be asked to type a character. Type 'q' to -stop at this match, type 'n' to find the next match. - -The "*", "#", "g*" and "g#" commands look for a word near the cursor in this -order, the first one that is found is used: -- The keyword currently under the cursor. -- The first keyword to the right of the cursor, in the same line. -- The WORD currently under the cursor. -- The first WORD to the right of the cursor, in the same line. 
-The keyword may only contain letters and characters in 'iskeyword'. -The WORD may contain any non-blanks (<Tab>s and/or <Space>s). -Note that if you type with ten fingers, the characters are easy to remember: -the "#" is under your left hand middle finger (search to the left and up) and -the "*" is under your right hand middle finger (search to the right and down). -(this depends on your keyboard layout though). - - *E956* -In very rare cases a regular expression is used recursively. This can happen -when executing a pattern takes a long time and when checking for messages on -channels a callback is invoked that also uses a pattern or an autocommand is -triggered. In most cases this should be fine, but if a pattern is in use when -it's used again it fails. Usually this means there is something wrong with -the pattern. - -============================================================================== -2. The definition of a pattern *search-pattern* *pattern* *[pattern]* - *regular-expression* *regexp* *Pattern* - *E383* *E476* - -For starters, read chapter 27 of the user manual |usr_27.txt|. - - */bar* */\bar* */pattern* -1. A pattern is one or more branches, separated by "\|". It matches anything - that matches one of the branches. Example: "foo\|beep" matches "foo" and - matches "beep". If more than one branch matches, the first one is used. - - pattern ::= branch - or branch \| branch - or branch \| branch \| branch - etc. - - */branch* */\&* -2. A branch is one or more concats, separated by "\&". It matches the last - concat, but only if all the preceding concats also match at the same - position. Examples: - "foobeep\&..." matches "foo" in "foobeep". - ".*Peter\&.*Bob" matches in a line containing both "Peter" and "Bob" - - branch ::= concat - or concat \& concat - or concat \& concat \& concat - etc. - - */concat* -3. A concat is one or more pieces, concatenated. It matches a match for the - first piece, followed by a match for the second piece, etc. 
Example: - "f[0-9]b", first matches "f", then a digit and then "b". - - concat ::= piece - or piece piece - or piece piece piece - etc. - - */piece* -4. A piece is an atom, possibly followed by a multi, an indication of how many - times the atom can be matched. Example: "a*" matches any sequence of "a" - characters: "", "a", "aa", etc. See |/multi|. - - piece ::= atom - or atom multi - - */atom* -5. An atom can be one of a long list of items. Many atoms match one character - in the text. It is often an ordinary character or a character class. - Parentheses can be used to make a pattern into an atom. The "\z(\)" - construct is only for syntax highlighting. - - atom ::= ordinary-atom |/ordinary-atom| - or \( pattern \) |/\(| - or \%( pattern \) |/\%(| - or \z( pattern \) |/\z(| - - - */\%#=* *two-engines* *NFA* -MNV includes two regexp engines: -1. An old, backtracking engine that supports everything. -2. A new, NFA engine that works much faster on some patterns, possibly slower - on some patterns. - *E1281* -MNV will automatically select the right engine for you. However, if you run -into a problem or want to specifically select one engine or the other, you can -prepend one of the following to the pattern: - - \%#=0 Force automatic selection. Only has an effect when - 'regexpengine' has been set to a non-zero value. - \%#=1 Force using the old engine. - \%#=2 Force using the NFA engine. - -You can also use the 'regexpengine' option to change the default. - - *E864* *E868* *E874* *E875* *E876* *E877* *E878* -If selecting the NFA engine and it runs into something that is not implemented -the pattern will not match. This is only useful when debugging MNV. - -============================================================================== -3. Magic */magic* - -Some characters in the pattern, such as letters, are taken literally. They -match exactly the same character in the text. When preceded with a backslash -however, these characters may get a special meaning. 
For example, "a" matches -the letter "a", while "\a" matches any alphabetic character. - -Other characters have a special meaning without a backslash. They need to be -preceded with a backslash to match literally. For example "." matches any -character while "\." matches a dot. - -If a character is taken literally or not depends on the 'magic' option and the -items in the pattern mentioned next. The 'magic' option should always be set, -but it can be switched off for Vi compatibility. We mention the effect of -'nomagic' here for completeness, but we recommend against using that. - */\m* */\M* -Use of "\m" makes the pattern after it be interpreted as if 'magic' is set, -ignoring the actual value of the 'magic' option. -Use of "\M" makes the pattern after it be interpreted as if 'nomagic' is used. - */\v* */\V* -Use of "\v" means that after it, all ASCII characters except '0'-'9', 'a'-'z', -'A'-'Z' and '_' have special meaning: "very magic" - -Use of "\V" means that after it, only a backslash and the terminating -character (usually / or ?) have special meaning: "very nomagic" - -Examples: -after: \v \m \M \V matches ~ - 'magic' 'nomagic' - a a a a literal 'a' - \a \a \a \a any alphabetic character - . . \. \. any character - \. \. . . literal dot - $ $ $ \$ end-of-line - * * \* \* any number of the previous atom - ~ ~ \~ \~ latest substitute string - () \(\) \(\) \(\) group as an atom - | \| \| \| nothing: separates alternatives - \\ \\ \\ \\ literal backslash - \{ { { { literal curly brace - -{only MNV supports \m, \M, \v and \V} - -If you want to you can make a pattern immune to the 'magic' option being set -or not by putting "\m" or "\M" at the start of the pattern. - -============================================================================== -4. Overview of pattern items *pattern-overview* - *E865* *E866* *E867* *E869* - -Overview of multi items. */multi* *E61* *E62* -More explanation and examples below, follow the links. 
*E64* *E871* - - multi ~ - 'magic' 'nomagic' matches of the preceding atom ~ -|/star| * \* 0 or more as many as possible -|/\+| \+ \+ 1 or more as many as possible -|/\=| \= \= 0 or 1 as many as possible -|/\?| \? \? 0 or 1 as many as possible - -|/\{| \{n,m} \{n,m} n to m as many as possible - \{n} \{n} n exactly - \{n,} \{n,} at least n as many as possible - \{,m} \{,m} 0 to m as many as possible - \{} \{} 0 or more as many as possible (same as *) - -|/\{-| \{-n,m} \{-n,m} n to m as few as possible - \{-n} \{-n} n exactly - \{-n,} \{-n,} at least n as few as possible - \{-,m} \{-,m} 0 to m as few as possible - \{-} \{-} 0 or more as few as possible - - *E59* -|/\@>| \@> \@> 1, like matching a whole pattern -|/\@=| \@= \@= nothing, requires a match |/zero-width| -|/\@!| \@! \@! nothing, requires NO match |/zero-width| -|/\@<=| \@<= \@<= nothing, requires a match behind |/zero-width| -|/\@<!| \@<! \@<! nothing, requires NO match behind |/zero-width| - - -Overview of ordinary atoms. */ordinary-atom* -More explanation and examples below, follow the links. - - ordinary atom ~ - magic nomagic matches ~ -|/^| ^ ^ start-of-line (at start of pattern) |/zero-width| -|/\^| \^ \^ literal '^' -|/\_^| \_^ \_^ start-of-line (used anywhere) |/zero-width| -|/$| $ $ end-of-line (at end of pattern) |/zero-width| -|/\$| \$ \$ literal '$' -|/\_$| \_$ \_$ end-of-line (used anywhere) |/zero-width| -|/.| . \. any single character (not an end-of-line) -|/\_.| \_. \_. 
any single character or end-of-line -|/\<| \< \< beginning of a word |/zero-width| -|/\>| \> \> end of a word |/zero-width| -|/\zs| \zs \zs anything, sets start of match -|/\ze| \ze \ze anything, sets end of match -|/\%^| \%^ \%^ beginning of file |/zero-width| *E71* -|/\%$| \%$ \%$ end of file |/zero-width| -|/\%V| \%V \%V inside Visual area |/zero-width| -|/\%#| \%# \%# cursor position |/zero-width| -|/\%'m| \%'m \%'m mark m position |/zero-width| -|/\%l| \%23l \%23l in line 23 |/zero-width| -|/\%c| \%23c \%23c in column 23 |/zero-width| -|/\%v| \%23v \%23v in virtual column 23 |/zero-width| - -Character classes: */character-classes* - magic nomagic matches ~ -|/\i| \i \i identifier character (see 'isident' option) -|/\I| \I \I like "\i", but excluding digits -|/\k| \k \k keyword character (see 'iskeyword' option) -|/\K| \K \K like "\k", but excluding digits -|/\f| \f \f file name character (see 'isfname' option) -|/\F| \F \F like "\f", but excluding digits -|/\p| \p \p printable character (see 'isprint' option) -|/\P| \P \P like "\p", but excluding digits -|/\s| \s \s whitespace character: <Space> and <Tab> -|/\S| \S \S non-whitespace character; opposite of \s -|/\d| \d \d digit: [0-9] -|/\D| \D \D non-digit: [^0-9] -|/\x| \x \x hex digit: [0-9A-Fa-f] -|/\X| \X \X non-hex digit: [^0-9A-Fa-f] -|/\o| \o \o octal digit: [0-7] -|/\O| \O \O non-octal digit: [^0-7] -|/\w| \w \w word character: [0-9A-Za-z_] -|/\W| \W \W non-word character: [^0-9A-Za-z_] -|/\h| \h \h head of word character: [A-Za-z_] -|/\H| \H \H non-head of word character: [^A-Za-z_] -|/\a| \a \a alphabetic character: [A-Za-z] -|/\A| \A \A non-alphabetic character: [^A-Za-z] -|/\l| \l \l lowercase character: [a-z] -|/\L| \L \L non-lowercase character: [^a-z] -|/\u| \u \u uppercase character: [A-Z] -|/\U| \U \U non-uppercase character [^A-Z] -|/\_| \_x \_x where x is any of the characters above: character - class with end-of-line included -(end of character classes) - - magic nomagic matches ~ -|/\e| \e 
\e <Esc> -|/\t| \t \t <Tab> -|/\r| \r \r <CR> -|/\b| \b \b <BS> -|/\n| \n \n end-of-line -|/~| ~ \~ last given substitute string -|/\1| \1 \1 same string as matched by first \(\) -|/\2| \2 \2 Like "\1", but uses second \(\) - ... -|/\9| \9 \9 Like "\1", but uses ninth \(\) - *E68* -|/\z1| \z1 \z1 only for syntax highlighting, see |:syn-ext-match| - ... -|/\z1| \z9 \z9 only for syntax highlighting, see |:syn-ext-match| - - x x a character with no special meaning matches itself - -|/[]| [] \[] any character specified inside the [] -|/\%[]| \%[] \%[] a sequence of optionally matched atoms - -|/\c| \c \c ignore case, do not use the 'ignorecase' option -|/\C| \C \C match case, do not use the 'ignorecase' option -|/\Z| \Z \Z ignore differences in Unicode "combining characters". - Useful when searching voweled Hebrew or Arabic text. - - magic nomagic matches ~ -|/\m| \m \m 'magic' on for the following chars in the pattern -|/\M| \M \M 'magic' off for the following chars in the pattern -|/\v| \v \v the following chars in the pattern are "very magic" -|/\V| \V \V the following chars in the pattern are "very nomagic" -|/\%#=| \%#=1 \%#=1 select regexp engine |/zero-width| - -|/\%d| \%d \%d match specified decimal character (eg \%d123) -|/\%x| \%x \%x match specified hex character (eg \%x2a) -|/\%o| \%o \%o match specified octal character (eg \%o040) -|/\%u| \%u \%u match specified multibyte character (eg \%u20ac) -|/\%U| \%U \%U match specified large multibyte character (eg - \%U12345678) -|/\%C| \%C \%C match any composing characters - -Example matches ~ -\<\I\i* or -\<\h\w* -\<[a-zA-Z_][a-zA-Z0-9_]* - An identifier (e.g., in a C program). - -\(\.$\|\. \) A period followed by <EOL> or a space. - -[.!?][])"']*\($\|[ ]\) A search pattern that finds the end of a sentence, - with almost the same definition as the ")" command. - -cat\Z Both "cat" and "càt" ("a" followed by 0x0300) - Does not match "càt" (character 0x00e0), even - though it may look the same. 
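The identifier patterns under "Example matches" can be tried outside the editor. The following is a minimal sketch in Python's "re" module, not part of MNV. It assumes that "\b" is an acceptable stand-in for "\<" (unlike "\<", "\b" also matches at the end of a word). The names IDENTIFIER, sample and identifiers are made up for this illustration.

```python
import re

# Rough analogue of "\<[a-zA-Z_][a-zA-Z0-9_]*".
# "\b" replaces "\<" here; this is an approximation, see the note above.
IDENTIFIER = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*")

sample = "x1 = 2foo + _bar"

# "2foo" is skipped: there is no word boundary between "2" and "f",
# so the match cannot start in the middle of that word.
identifiers = IDENTIFIER.findall(sample)
print(identifiers)  # ['x1', '_bar']
```

As with "\<\h\w*", a word that starts with a digit is never reported as an identifier.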
- - -============================================================================== -5. Multi items *pattern-multi-items* - -An atom can be followed by an indication of how many times the atom can be -matched and in what way. This is called a multi. See |/multi| for an -overview. - - */star* */\star* -* (use \* when 'magic' is not set) - Matches 0 or more of the preceding atom, as many as possible. - Example 'nomagic' matches ~ - a* a\* "", "a", "aa", "aaa", etc. - .* \.\* anything, also an empty string, no end-of-line - \_.* \_.\* everything up to the end of the buffer - \_.*END \_.\*END everything up to and including the last "END" - in the buffer - - Exception: When "*" is used at the start of the pattern or just after - "^" it matches the star character. - - Be aware that repeating "\_." can match a lot of text and take a long - time. For example, "\_.*END" matches all text from the current - position to the last occurrence of "END" in the file. Since the "*" - will match as many as possible, this first skips over all lines until - the end of the file and then tries matching "END", backing up one - character at a time. - - */\+* -\+ Matches 1 or more of the preceding atom, as many as possible. - Example matches ~ - ^.\+$ any non-empty line - \s\+ white space of at least one character - - */\=* -\= Matches 0 or 1 of the preceding atom, as many as possible. - Example matches ~ - foo\= "fo" and "foo" - - */\?* -\? Just like \=. Cannot be used when searching backwards with the "?" - command. 
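The greedy multis above have close counterparts in Perl-style syntax. This is a small sketch using Python's "re" module, assuming the usual mapping of "\=" to "?", "\+" to "+" and "\_." to "." with the DOTALL flag. It illustrates the matching rules only and says nothing about how MNV implements them. All variable names are hypothetical.

```python
import re

# "foo\=" : the final "o" may appear 0 or 1 times.
optional_o = [s for s in ("fo", "foo", "fooo") if re.fullmatch(r"foo?", s)]

# "^.\+$" : any non-empty line (blanks count as characters).
non_empty = [ln for ln in ("", "x", "  ") if re.fullmatch(r".+", ln)]

# "\_.*END" : "*" takes as much as possible, so the match runs to the
# LAST "END", crossing the line break.
text = "a END b\nc END d"
greedy = re.search(r".*END", text, re.DOTALL).group()

print(optional_o)    # ['fo', 'foo']
print(non_empty)     # ['x', '  ']
print(repr(greedy))  # 'a END b\nc END'
```

The last case shows why repeating "\_." can be expensive: the engine first consumes everything up to the end of the text and then backs up until "END" fits.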
- - */\{* *E60* *E554* *E870* -\{n,m} Matches n to m of the preceding atom, as many as possible -\{n} Matches n of the preceding atom -\{n,} Matches at least n of the preceding atom, as many as possible -\{,m} Matches 0 to m of the preceding atom, as many as possible -\{} Matches 0 or more of the preceding atom, as many as possible (like *) - */\{-* -\{-n,m} matches n to m of the preceding atom, as few as possible -\{-n} matches n of the preceding atom -\{-n,} matches at least n of the preceding atom, as few as possible -\{-,m} matches 0 to m of the preceding atom, as few as possible -\{-} matches 0 or more of the preceding atom, as few as possible - - n and m are positive decimal numbers or zero - *non-greedy* - If a "-" appears immediately after the "{", then a shortest match - first algorithm is used (see example below). In particular, "\{-}" is - the same as "*" but uses the shortest match first algorithm. BUT: A - match that starts earlier is preferred over a shorter match: "a\{-}b" - matches "aaab" in "xaaab". - - Example matches ~ - ab\{2,3}c "abbc" or "abbbc" - a\{5} "aaaaa" - ab\{2,}c "abbc", "abbbc", "abbbbc", etc. - ab\{,3}c "ac", "abc", "abbc" or "abbbc" - a[bc]\{3}d "abbbd", "abbcd", "acbcd", "acccd", etc. - a\(bc\)\{1,2}d "abcd" or "abcbcd" - a[bc]\{-}[cd] "abc" in "abcd" - a[bc]*[cd] "abcd" in "abcd" - - The } may optionally be preceded with a backslash: \{n,m\}. - - */\@=* -\@= Matches the preceding atom with zero width. - Like "(?=pattern)" in Perl. - Example matches ~ - foo\(bar\)\@= "foo" in "foobar" - foo\(bar\)\@=foo nothing - */zero-width* - When using "\@=" (or "^", "$", "\<", "\>") no characters are included - in the match. These items are only used to check if a match can be - made. This can be tricky, because a match with following items will - be done in the same position. The last example above will not match - "foobarfoo", because it tries match "foo" in the same position where - "bar" matched. 
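The shortest-match-first rule and the zero-width behaviour of "\@=" can be checked with the Perl-style forms the text refers to. This is a hedged sketch in Python's "re" module, assuming "\{-}" corresponds to "*?" and "\(bar\)\@=" to "(?=bar)". The variable names are invented for the example.

```python
import re

# "a[bc]\{-}[cd]" versus "a[bc]*[cd]" on "abcd".
shortest = re.search(r"a[bc]*?[cd]", "abcd").group()
longest = re.search(r"a[bc]*[cd]", "abcd").group()

# An earlier start wins over a shorter match: "a\{-}b" in "xaaab".
earliest = re.search(r"a*?b", "xaaab").group()

# "foo\(bar\)\@=" matches "foo" without consuming "bar".
lookahead = re.search(r"foo(?=bar)", "foobar").group()

# "foo\(bar\)\@=foo" cannot match "foobarfoo": after the look-ahead the
# second "foo" is tried at the position where "bar" starts.
no_match = re.search(r"foo(?=bar)foo", "foobarfoo")

print(shortest, longest, earliest, lookahead, no_match)
# abc abcd aaab foo None
```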
- - Note that using "\&" works the same as using "\@=": "foo\&.." is the - same as "\(foo\)\@=..". But using "\&" is easier, you don't need the - parentheses. - - - */\@!* -\@! Matches with zero width if the preceding atom does NOT match at the - current position. |/zero-width| - Like "(?!pattern)" in Perl. - Example matches ~ - foo\(bar\)\@! any "foo" not followed by "bar" - a.\{-}p\@! "a", "ap", "app", "appp", etc. not immediately - followed by a "p" - if \(\(then\)\@!.\)*$ "if " not followed by "then" - - Using "\@!" is tricky, because there are many places where a pattern - does not match. "a.*p\@!" will match from an "a" to the end of the - line, because ".*" can match all characters in the line and the "p" - doesn't match at the end of the line. "a.\{-}p\@!" will match any - "a", "ap", "app", etc. that isn't followed by a "p", because the "." - can match a "p" and "p\@!" doesn't match after that. - - You can't use "\@!" to look for a non-match before the matching - position: "\(foo\)\@!bar" will match "bar" in "foobar", because at the - position where "bar" matches, "foo" does not match. To avoid matching - "foobar" you could use "\(foo\)\@!...bar", but that doesn't match a - bar at the start of a line. Use "\(foo\)\@<!bar". - - Useful example: to find "foo" in a line that does not contain "bar": > - /^\%(.*bar\)\@!.*\zsfoo -< This pattern first checks that there is not a single position in the - line where "bar" matches. If ".*bar" matches somewhere the \@! will - reject the pattern. When there is no match any "foo" will be found. - The "\zs" is to have the match start just before "foo". - - */\@<=* -\@<= Matches with zero width if the preceding atom matches just before what - follows. |/zero-width| - Like "(?<=pattern)" in Perl, but MNV allows non-fixed-width patterns. - Example matches ~ - \(an\_s\+\)\@<=file "file" after "an" and white space or an - end-of-line - For speed it's often much better to avoid this multi. Try using "\zs" - instead |/\zs|. 
To match the same as the above example: - an\_s\+\zsfile - At least set a limit for the look-behind, see below. - - "\@<=" and "\@<!" check for matches just before what follows. - Theoretically these matches could start anywhere before this position. - But to limit the time needed, only the line where what follows matches - is searched, and one line before that (if there is one). This should - be sufficient to match most things and not be too slow. - - In the old regexp engine the part of the pattern after "\@<=" and - "\@<!" are checked for a match first, thus things like "\1" don't work - to reference \(\) inside the preceding atom. It does work the other - way around: - Bad example matches ~ - \%#=1\1\@<=,\([a-z]\+\) ",abc" in "abc,abc" - - However, the new regexp engine works differently, it is better to not - rely on this behavior, do not use \@<= if it can be avoided: - Example matches ~ - \([a-z]\+\)\zs,\1 ",abc" in "abc,abc" - -\@123<= - Like "\@<=" but only look back 123 bytes. This avoids trying lots - of matches that are known to fail and make executing the pattern very - slow. Example, check if there is a "<" just before "span": - /<\@1<=span - This will try matching "<" only one byte before "span", which is the - only place that works anyway. - After crossing a line boundary, the limit is relative to the end of - the line. Thus the characters at the start of the line with the match - are not counted (this is just to keep it simple). - The number zero is the same as no limit. - - */\@<!* -\@<! Matches with zero width if the preceding atom does NOT match just - before what follows. Thus this matches if there is no position in the - current or previous line where the atom matches such that it ends just - before what follows. |/zero-width| - Like "(?<!pattern)" in Perl, but MNV allows non-fixed-width patterns. - The match with the preceding atom is made to end just before the match - with what follows, thus an atom that ends in ".*" will work. 
- Warning: This can be slow (because many positions need to be checked - for a match). Use a limit if you can, see below. - Example matches ~ - \(foo\)\@<!bar any "bar" that's not in "foobar" - \(\/\/.*\)\@<!in "in" which is not after "//" - -\@123<! - Like "\@<!" but only look back 123 bytes. This avoids trying lots of - matches that are known to fail and make executing the pattern very - slow. - - */\@>* -\@> Matches the preceding atom like matching a whole pattern. - Like "(?>pattern)" in Perl. - Example matches ~ - \(a*\)\@>a nothing (the "a*" takes all the "a"'s, there can't be - another one following) - - This matches the preceding atom as if it was a pattern by itself. If - it doesn't match, there is no retry with shorter sub-matches or - anything. Observe this difference: "a*b" and "a*ab" both match - "aaab", but in the second case the "a*" matches only the first two - "a"s. "\(a*\)\@>ab" will not match "aaab", because the "a*" matches - the "aaa" (as many "a"s as possible), thus the "ab" can't match. - - -============================================================================== -6. Ordinary atoms *pattern-atoms* - -An ordinary atom can be: - - */^* -^ At beginning of pattern or after "\|", "\(", "\%(" or "\n": matches - start-of-line; at other positions, matches literal '^'. |/zero-width| - Example matches ~ - ^beep( the start of the C function "beep" (probably). - - */\^* -\^ Matches literal '^'. Can be used at any position in the pattern, but - not inside []. - - */\_^* -\_^ Matches start-of-line. |/zero-width| Can be used at any position in - the pattern, but not inside []. - Example matches ~ - \_s*\_^foo white space and blank lines and then "foo" at - start-of-line - - */$* -$ At end of pattern or in front of "\|", "\)" or "\n" ('magic' on): - matches end-of-line <EOL>; at other positions, matches literal '$'. - |/zero-width| - - */\$* -\$ Matches literal '$'. Can be used at any position in the pattern, but - not inside []. 
- - */\_$* -\_$ Matches end-of-line. |/zero-width| Can be used at any position in the - pattern, but not inside []. Note that "a\_$b" never matches, since - "b" cannot match an end-of-line. Use "a\nb" instead |/\n|. - Example matches ~ - foo\_$\_s* "foo" at end-of-line and following white space and - blank lines - -. (with 'nomagic': \.) */.* */\.* - Matches any single character, but not an end-of-line. - - */\_.* -\_. Matches any single character or end-of-line. - Careful: "\_.*" matches all text to the end of the buffer! - - */\<* -\< Matches the beginning of a word: The next char is the first char of a - word. The 'iskeyword' option specifies what is a word character. - |/zero-width| - - */\>* -\> Matches the end of a word: The previous char is the last char of a - word. The 'iskeyword' option specifies what is a word character. - |/zero-width| - - */\zs* -\zs Matches at any position, but not inside [], and sets the start of the - match there: The next char is the first char of the whole match. - |/zero-width| - Example: > - /^\s*\zsif -< matches an "if" at the start of a line, ignoring white space. - Can be used multiple times, the last one encountered in a matching - branch is used. Example: > - /\(.\{-}\zsFab\)\{3} -< Finds the third occurrence of "Fab". - This cannot be followed by a multi. *E888* - {not available when compiled without the |+syntax| feature} - */\ze* -\ze Matches at any position, but not inside [], and sets the end of the - match there: The previous char is the last char of the whole match. - |/zero-width| - Can be used multiple times, the last one encountered in a matching - branch is used. - Example: "end\ze\(if\|for\)" matches the "end" in "endif" and - "endfor". - This cannot be followed by a multi. |E888| - {not available when compiled without the |+syntax| feature} - - */\%^* *start-of-file* -\%^ Matches start of the file. When matching with a string, matches the - start of the string. 
- For example, to find the first "MNV" in a file: > - /\%^\_.\{-}\zsMNV -< - */\%$* *end-of-file* -\%$ Matches end of the file. When matching with a string, matches the - end of the string. - Note that this does NOT find the last "MNV" in a file: > - /MNV\_.\{-}\%$ -< It will find the next MNV, because the part after it will always - match. This one will find the last "MNV" in the file: > - /MNV\ze\(\(MNV\)\@!\_.\)*\%$ -< This uses |/\@!| to ascertain that "MNV" does NOT match in any - position after the first "MNV". - Searching from the end of the file backwards is easier! - - */\%V* -\%V Match inside the Visual area. When Visual mode has already been - stopped match in the area that |gv| would reselect. - This is a |/zero-width| match. To make sure the whole pattern is - inside the Visual area put it at the start and just before the end of - the pattern, e.g.: > - /\%Vfoo.*ba\%Vr -< This also works if only "foo bar" was Visually selected. This: > - /\%Vfoo.*bar\%V -< would match "foo bar" if the Visual selection continues after the "r". - Only works for the current buffer. - - */\%#* *cursor-position* -\%# Matches with the cursor position. Only works when matching in a - buffer displayed in a window. - WARNING: When the cursor is moved after the pattern was used, the - result becomes invalid. MNV doesn't automatically update the matches. - This is especially relevant for syntax highlighting and 'hlsearch'. - In other words: When the cursor moves the display isn't updated for - this change. An update is done for lines which are changed (the whole - line is updated) or when using the |CTRL-L| command (the whole screen - is updated). Example, to highlight the word under the cursor: > - /\k*\%#\k* -< When 'hlsearch' is set and you move the cursor around and make changes - this will clearly show when the match is updated or not. - - */\%'m* */\%<'m* */\%>'m* -\%'m Matches with the position of mark m. -\%<'m Matches before the position of mark m. 
-\%>'m Matches after the position of mark m. - Example, to highlight the text from mark 's to 'e: > - /.\%>'s.*\%<'e.. -< Note that two dots are required to include mark 'e in the match. That - is because "\%<'e" matches at the character before the 'e mark, and - since it's a |/zero-width| match it doesn't include that character. - WARNING: When the mark is moved after the pattern was used, the result - becomes invalid. MNV doesn't automatically update the matches. - Similar to moving the cursor for "\%#" |/\%#|. - - */\%l* */\%>l* */\%<l* *E951* *E1204* *E1273* -\%23l Matches in a specific line. -\%<23l Matches above a specific line (lower line number). -\%>23l Matches below a specific line (higher line number). -\%.l Matches at the cursor line. -\%<.l Matches above the cursor line. -\%>.l Matches below the cursor line. - These six can be used to match specific lines in a buffer. The "23" - can be any line number. The first line is 1. - WARNING: When inserting or deleting lines MNV does not automatically - update the matches. This means Syntax highlighting quickly becomes - wrong. Also when referring to the cursor position (".") and - the cursor moves the display isn't updated for this change. An update - is done when using the |CTRL-L| command (the whole screen is updated). - Example, to highlight the line where the cursor currently is: > - :exe '/\%' .. line(".") .. 'l' -< Alternatively use: > - /\%.l -< When 'hlsearch' is set and you move the cursor around and make changes - this will clearly show when the match is updated or not. - - */\%c* */\%>c* */\%<c* -\%23c Matches in a specific column. -\%<23c Matches before a specific column. -\%>23c Matches after a specific column. -\%.c Matches at the cursor column. -\%<.c Matches before the cursor column. -\%>.c Matches after the cursor column. - These six can be used to match specific columns in a buffer or string. - The "23" can be any column number. The first column is 1. 
Actually, - the column is the byte number (thus it's not exactly right for - multibyte characters). - WARNING: When inserting or deleting text MNV does not automatically - update the matches. This means Syntax highlighting quickly becomes - wrong. Also when referring to the cursor position (".") and - the cursor moves the display isn't updated for this change. An update - is done when using the |CTRL-L| command (the whole screen is updated). - Example, to highlight the column where the cursor currently is: > - :exe '/\%' .. col(".") .. 'c' -< Alternatively use: > - /\%.c -< When 'hlsearch' is set and you move the cursor around and make changes - this will clearly show when the match is updated or not. - Example for matching a single byte in column 44: > - /\%>43c.\%<46c -< Note that "\%<46c" matches in column 45 when the "." matches a byte in - column 44. - */\%v* */\%>v* */\%<v* -\%23v Matches in a specific virtual column. -\%<23v Matches before a specific virtual column. -\%>23v Matches after a specific virtual column. -\%.v Matches at the current virtual column. -\%<.v Matches before the current virtual column. -\%>.v Matches after the current virtual column. - These six can be used to match specific virtual columns in a buffer or - string. When not matching with a buffer in a window, the option - values of the current window are used (e.g., 'tabstop'). - The "23" can be any column number. The first column is 1. - Note that some virtual column positions will never match, because they - are halfway through a tab or other character that occupies more than - one screen character. - WARNING: When inserting or deleting text MNV does not automatically - update highlighted matches. This means Syntax highlighting quickly - becomes wrong. Also when referring to the cursor position (".") and - the cursor moves the display isn't updated for this change. An update - is done when using the |CTRL-L| command (the whole screen is updated). 
- Example, to highlight all the characters after virtual column 72: > - /\%>72v.* -< When 'hlsearch' is set and you move the cursor around and make changes - this will clearly show when the match is updated or not. - To match the text up to column 17: > - /^.*\%17v -< To match all characters after the current virtual column (where the - cursor is): > - /\%>.v.* -< Column 17 is not included, because this is a |/zero-width| match. To - include the column use: > - /^.*\%17v. -< This command does the same thing, but also matches when there is no - character in column 17: > - /^.*\%<18v. -< Note that without the "^" to anchor the match in the first column, - this will also highlight column 17: > - /.*\%17v -< Column 17 is highlighted by 'hlsearch' because there is another match - where ".*" matches zero characters. - - -Character classes: -\i identifier character (see 'isident' option) */\i* -\I like "\i", but excluding digits */\I* -\k keyword character (see 'iskeyword' option) */\k* -\K like "\k", but excluding digits */\K* -\f file name character (see 'isfname' option) */\f* -\F like "\f", but excluding digits */\F* -\p printable character (see 'isprint' option) */\p* -\P like "\p", but excluding digits */\P* - -NOTE: the above also work for multibyte characters. The ones below only -match ASCII characters, as indicated by the range. 
- - *whitespace* *white-space* -\s whitespace character: <Space> and <Tab> */\s* -\S non-whitespace character; opposite of \s */\S* -\d digit: [0-9] */\d* -\D non-digit: [^0-9] */\D* -\x hex digit: [0-9A-Fa-f] */\x* -\X non-hex digit: [^0-9A-Fa-f] */\X* -\o octal digit: [0-7] */\o* -\O non-octal digit: [^0-7] */\O* -\w word character: [0-9A-Za-z_] */\w* -\W non-word character: [^0-9A-Za-z_] */\W* -\h head of word character: [A-Za-z_] */\h* -\H non-head of word character: [^A-Za-z_] */\H* -\a alphabetic character: [A-Za-z] */\a* -\A non-alphabetic character: [^A-Za-z] */\A* -\l lowercase character: [a-z] */\l* -\L non-lowercase character: [^a-z] */\L* -\u uppercase character: [A-Z] */\u* -\U non-uppercase character: [^A-Z] */\U* - - NOTE: Using the atom is faster than the [] form. - - NOTE: 'ignorecase', "\c" and "\C" are not used by character classes. - - */\_* *E63* */\_i* */\_I* */\_k* */\_K* */\_f* */\_F* - */\_p* */\_P* */\_s* */\_S* */\_d* */\_D* */\_x* */\_X* - */\_o* */\_O* */\_w* */\_W* */\_h* */\_H* */\_a* */\_A* - */\_l* */\_L* */\_u* */\_U* -\_x Where "x" is any of the characters above: The character class with - end-of-line added -(end of character classes) - -\e matches <Esc> */\e* -\t matches <Tab> */\t* -\r matches <CR> */\r* -\b matches <BS> */\b* -\n matches an end-of-line */\n* - When matching in a string instead of buffer text a literal newline - character is matched. - -~ matches the last given substitute string */~* */\~* - -\(\) A pattern enclosed by escaped parentheses. */\(* */\(\)* */\)* - E.g., "\(^a\)" matches 'a' at the start of a line. - There can only be nine of these. You can use "\%(" to add more, but - not counting it as a sub-expression. - *E51* *E54* *E55* *E872* *E873* - -\1 Matches the same string that was matched by */\1* *E65* - the first sub-expression in \( and \). - Example: "\([a-z]\).\1" matches "ata", "ehe", "tot", etc. -\2 Like "\1", but uses second sub-expression, */\2* - ... 
*/\3* -\9 Like "\1", but uses ninth sub-expression. */\9* - Note: The numbering of groups is done based on which "\(" comes first - in the pattern (going left to right), NOT based on what is matched - first. - -\%(\) A pattern enclosed by escaped parentheses. */\%(\)* */\%(* *E53* - Just like \(\), but without counting it as a sub-expression. This - allows using more groups and it's a little bit faster. - -x A single character, with no special meaning, matches itself - - */\* */\\* -\x A backslash followed by a single character, with no special meaning, - is reserved for future expansions - -[] (with 'nomagic': \[]) */[]* */\[]* */\_[]* */collection* *E76* -\_[] - A collection. This is a sequence of characters enclosed in square - brackets. It matches any single character in the collection. - Example matches ~ - [xyz] any 'x', 'y' or 'z' - [a-zA-Z]$ any alphabetic character at the end of a line - \c[a-z]$ same - [А-яЁё] Russian alphabet (with utf-8 and cp1251) - - */[\n]* - With "\_" prepended the collection also includes the end-of-line. - The same can be done by including "\n" in the collection. The - end-of-line is also matched when the collection starts with "^"! Thus - "\_[^ab]" matches the end-of-line and any character but "a" and "b". - This makes it Vi compatible: Without the "\_" or "\n" the collection - does not match an end-of-line. - *E769* - When the ']' is not there MNV will not give an error message but - assume no collection is used. Useful to search for '['. However, you - do get E769 for internal searching. And be aware that in a - `:substitute` command the whole command becomes the pattern. E.g. - ":s/[/x/" searches for "[/x" and replaces it with nothing. It does - not search for "[" and replaces it with "x"! - - *E944* *E945* - If the sequence begins with "^", it matches any single character NOT - in the collection: "[^xyz]" matches anything but 'x', 'y' and 'z'. 
- - If two characters in the sequence are separated by '-', this is - shorthand for the full list of ASCII characters between them. E.g., - "[0-9]" matches any decimal digit. If the starting character - exceeds the ending character, e.g. [c-a], E944 occurs. Non-ASCII - characters can be used, but the character values must not be more - than 256 apart in the old regexp engine. For example, searching by - [\u3000-\u4000] after setting re=1 emits a E945 error. Prepending - \%#=2 will fix it. - - A character class expression is evaluated to the set of characters - belonging to that character class. The following character classes - are supported: - Name Func Contents ~ -*[:alnum:]* [:alnum:] isalnum ASCII letters and digits -*[:alpha:]* [:alpha:] isalpha ASCII letters -*[:blank:]* [:blank:] space and tab -*[:cntrl:]* [:cntrl:] iscntrl ASCII control characters -*[:digit:]* [:digit:] decimal digits '0' to '9' -*[:graph:]* [:graph:] isgraph ASCII printable characters excluding - space -*[:lower:]* [:lower:] (1) lowercase letters (all letters when - 'ignorecase' is used and the old - engine is in use |two-engines|) -*[:print:]* [:print:] (2) printable characters including space -*[:punct:]* [:punct:] ispunct ASCII punctuation characters -*[:space:]* [:space:] whitespace characters: space, tab, CR, - NL, vertical tab, form feed -*[:upper:]* [:upper:] (3) uppercase letters (all letters when - 'ignorecase' is used and the old - engine is in use |two-engines|) -*[:xdigit:]* [:xdigit:] hexadecimal digits: 0-9, a-f, A-F -*[:return:]* [:return:] the <CR> character -*[:tab:]* [:tab:] the <Tab> character -*[:escape:]* [:escape:] the <Esc> character -*[:backspace:]* [:backspace:] the <BS> character -*[:ident:]* [:ident:] identifier character (same as "\i") -*[:keyword:]* [:keyword:] keyword character (same as "\k") -*[:fname:]* [:fname:] file name character (same as "\f") - The square brackets in character class expressions are additional to - the square brackets delimiting a 
collection. For example, the - following is a plausible pattern for a UNIX filename: - "[-./[:alnum:]_~]\+". That is, a list of at least one character, - each of which is either '-', '.', '/', alphabetic, numeric, '_' or - '~'. - These items only work for 8-bit characters, except [:lower:] and - [:upper:] also work for multibyte characters when using the new - regexp engine. See |two-engines|. In the future these items may - work for multibyte characters. For now, to get all "alpha" - characters you can use: [[:lower:][:upper:]]. - - The "Func" column shows what library function is used. The - implementation depends on the system. Otherwise: - (1) Uses islower() for ASCII and MNV builtin rules for other - characters. - (2) Uses MNV builtin rules - (3) As with (1) but using isupper() - */[[=* *[==]* - - An equivalence class. This means that characters are matched that - have almost the same meaning, e.g., when ignoring accents. This - only works for Unicode, latin1 and latin9. The form is: - [=a=] - */[[.* *[..]* - - A collation element. This currently simply accepts a single - character in the form: - [.a.] - */\]* - - To include a literal ']', '^', '-' or '\' in the collection, put a - backslash before it: "[xyz\]]", "[\^xyz]", "[xy\-z]" and "[xyz\\]". - (Note: POSIX does not support the use of a backslash this way). For - ']' you can also make it the first character (following a possible - "^"): "[]xyz]" or "[^]xyz]". - For '-' you can also make it the first or last character: "[-xyz]", - "[^-xyz]" or "[xyz-]". For '\' you can also let it be followed by - any character that's not in "^]-\bdertnoUux". "[\xyz]" matches '\', - 'x', 'y' and 'z'. It's better to use "\\" though, future expansions - may use other characters after '\'. - - Omitting the trailing ] is not considered an error. "[]" works like - "[]]", it matches the ']' character. 
- - The following translations are accepted when the 'l' flag is not - included in 'cpoptions': - \e <Esc> - \t <Tab> - \r <CR> (NOT end-of-line!) - \b <BS> - \n line break, see above |/[\n]| - \d123 decimal number of character - \o40 octal number of character up to 0o377 - \x20 hexadecimal number of character up to 0xff - \u20AC hex. number of multibyte character up to 0xffff - \U1234 hex. number of multibyte character up to 8 characters - 0xffffffff |E1541| - NOTE: The other backslash codes mentioned above do not work inside - []! - - Matching with a collection can be slow, because each character in - the text has to be compared with each character in the collection. - Use one of the other atoms above when possible. Example: "\d" is - much faster than "[0-9]" and matches the same characters. However, - the new |NFA| regexp engine deals with this better than the old one. - - */\%[]* *E69* *E70* *E369* -\%[] A sequence of optionally matched atoms. This always matches. - It matches as much of the list of atoms it contains as possible. Thus - it stops at the first atom that doesn't match. For example: > - /r\%[ead] -< matches "r", "re", "rea" or "read". The longest that matches is used. - To match the Ex command "function", where "fu" is required and - "nction" is optional, this would work: > - /\<fu\%[nction]\> -< The end-of-word atom "\>" is used to avoid matching "fu" in "full". - It gets more complicated when the atoms are not ordinary characters. - You don't often have to use it, but it is possible. Example: > - /\<r\%[[eo]ad]\> -< Matches the words "r", "re", "ro", "rea", "roa", "read" and "road". - There can be no \(\), \%(\) or \z(\) items inside the [] and \%[] does - not nest. - To include a "[" use "[[]" and for "]" use []]", e.g.,: > - /index\%[[[]0[]]] -< matches "index" "index[", "index[0" and "index[0]". 
- {not available when compiled without the |+syntax| feature} - - */\%d* */\%x* */\%o* */\%u* */\%U* *E678* - -\%d123 Matches the character specified with a decimal number. Must be - followed by a non-digit. -\%o40 Matches the character specified with an octal number up to 0o377. - Numbers below 0o40 must be followed by a non-octal digit or a - non-digit. -\%x2a Matches the character specified with up to two hexadecimal characters. -\%u20AC Matches the character specified with up to four hexadecimal - characters. -\%U1234abcd Matches the character specified with up to eight hexadecimal - characters, up to 0x7fffffff (the maximum allowed value is INT_MAX - |E1541|, but the maximum valid Unicode codepoint is U+10FFFF). - -============================================================================== -7. Ignoring case in a pattern */ignorecase* - -If the 'ignorecase' option is on, the case of normal letters is ignored. -'smartcase' can be set to ignore case when the pattern contains lowercase -letters only. - */\c* */\C* -When "\c" appears anywhere in the pattern, the whole pattern is handled like -'ignorecase' is on. The actual value of 'ignorecase' and 'smartcase' is -ignored. "\C" does the opposite: Force matching case for the whole pattern. -{only MNV supports \c and \C} -Note that 'ignorecase', "\c" and "\C" are not used for the character classes. - -Examples: - pattern 'ignorecase' 'smartcase' matches ~ - foo off - foo - foo on - foo Foo FOO - Foo on off foo Foo FOO - Foo on on Foo - \cfoo - - foo Foo FOO - foo\C - - foo - -Technical detail: *NL-used-for-Nul* -<Nul> characters in the file are stored as <NL> in memory. In the display -they are shown as "^@". The translation is done when reading and writing -files. To match a <Nul> with a search pattern you can just enter CTRL-@ or -"CTRL-V 000". This is probably just what you expect. Internally the -character is replaced with a <NL> in the search pattern. 
What is unusual is -that typing CTRL-V CTRL-J also inserts a <NL>, thus also searches for a <Nul> -in the file. - - *CR-used-for-NL* -When 'fileformat' is "mac", <NL> characters in the file are stored as <CR> -characters internally. In the text they are shown as "^J". Otherwise this -works similar to the usage of <NL> for a <Nul>. - -When working with expression evaluation, a <NL> character in the pattern -matches a <NL> in the string. The use of "\n" (backslash n) to match a <NL> -doesn't work there, it only works to match text in the buffer. - - *pattern-multi-byte* *pattern-multibyte* -Patterns will also work with multibyte characters, mostly as you would -expect. But invalid bytes may cause trouble, a pattern with an invalid byte -will probably never match. - -============================================================================== -8. Composing characters *patterns-composing* - - */\Z* -When "\Z" appears anywhere in the pattern, all composing characters are -ignored. Thus only the base characters need to match, the composing -characters may be different and the number of composing characters may differ. -Only relevant when 'encoding' is "utf-8". -Exception: If the pattern starts with one or more composing characters, these -must match. - */\%C* -Use "\%C" to skip any composing characters. For example, the pattern "a" does -not match in "càt" (where the a has the composing character 0x0300), but -"a\%C" does. Note that this does not match "cát" (where the á is character -0xe1, it does not have a compositing character). It does match "cat" (where -the a is just an a). - -When a composing character appears at the start of the pattern or after an -item that doesn't include the composing character, a match is found at any -character that includes this composing character. - -When using a dot and a composing character, this works the same as the -composing character by itself, except that it doesn't matter what comes before -this. 
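The effect of "\Z" described at the start of this section can be approximated outside the editor. The following Python sketch is only an illustration under stated assumptions: it treats every character in a Unicode "M" (mark) category as a composing character, which may differ from the table MNV uses, and it does a literal substring test instead of running a regexp engine. The helper names `strip_composing` and `contains_ignoring_composing` are hypothetical. The exception for patterns that start with composing characters is not modeled.

```python
import unicodedata


def strip_composing(text: str) -> str:
    """Drop combining marks (Unicode categories Mn, Mc, Me); keep base characters."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("M")
    )


def contains_ignoring_composing(pattern: str, text: str) -> bool:
    """Literal substring test that ignores composing characters on both sides."""
    return strip_composing(pattern) in strip_composing(text)


decomposed = "ca\u0300t"   # "a" followed by U+0300 COMBINING GRAVE ACCENT
precomposed = "c\u00e1t"   # U+00E1: one character, no composing character

# Only the base characters need to match, so the accent in the text is ignored.
print(contains_ignoring_composing("cat", decomposed))
# A different composing character in the pattern is ignored as well.
print(contains_ignoring_composing("ca\u0301t", decomposed))
# The precomposed letter is a different base character in this sketch.
print(contains_ignoring_composing("cat", precomposed))
```

In this sketch the precomposed character 0xe1 is never decomposed, so it is compared as a base character of its own. This is similar to the "a\%C" example above, which does not match it either.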
- -The order of composing characters does not matter. Also, the text may have -more composing characters than the pattern, it still matches. But all -composing characters in the pattern must be found in the text. - -Suppose B is a base character and x and y are composing characters: - pattern text match ~ - Bxy Bxy yes (perfect match) - Bxy Byx yes (order ignored) - Bxy By no (x missing) - Bxy Bx no (y missing) - Bx Bx yes (perfect match) - Bx By no (x missing) - Bx Bxy yes (extra y ignored) - Bx Byx yes (extra y ignored) - -============================================================================== -9. Compare with Perl patterns *perl-patterns* - -MNV's regexes are most similar to Perl's, in terms of what you can do. The -difference between them is mostly just notation; here's a summary of where -they differ: - -Capability in MNVspeak in Perlspeak ~ ----------------------------------------------------------------- -force case insensitivity \c (?i) -force case sensitivity \C (?-i) -backref-less grouping \%(atom\) (?:atom) -conservative quantifiers \{-n,m} *?, +?, ??, {}? -0-width match atom\@= (?=atom) -0-width non-match atom\@! (?!atom) -0-width preceding match atom\@<= (?<=atom) -0-width preceding non-match atom\@<! (?<!atom) -match without retry atom\@> (?>atom) - -MNV and Perl handle newline characters inside a string a bit differently: - -In Perl, ^ and $ only match at the very beginning and end of the text, -by default, but you can set the 'm' flag, which lets them match at -embedded newlines as well. You can also set the 's' flag, which causes -a . to match newlines as well. (Both these flags can be changed inside -a pattern using the same syntax used for the i flag above, BTW.) - -On the other hand, MNV's ^ and $ always match at embedded newlines, and -you get two separate atoms, \%^ and \%$, which only match at the very -start and end of the text, respectively. MNV solves the second problem -by giving you the \_ "modifier": put it in front of a . 
or a character -class, and they will match newlines as well. - -Finally, these constructs are unique to Perl: -- execution of arbitrary code in the regex: (?{perl code}) -- conditional expressions: (?(condition)true-expr|false-expr) - -...and these are unique to MNV: -- changing the magic-ness of a pattern: \v \V \m \M - (very useful for avoiding backslashitis) -- sequence of optionally matching atoms: \%[atoms] -- \& (which is to \| what "and" is to "or"; it forces several branches - to match at one spot) -- matching lines/columns by number: \%5l \%5c \%5v -- setting the start and end of the match: \zs \ze - -============================================================================== -10. Highlighting matches *match-highlight* - - *syntax-vs-match* - Note that the match highlight mechanism is independent - of |syntax-highlighting|, which is (usually) a buffer-local - highlighting, while matching is window-local, both methods - can be freely mixed. Match highlighting functions give you - a bit more flexibility in when and how to apply, but are - typically only used for temporary highlighting, without strict - rules. Both methods can be used to conceal text. - - Thus the matching functions like |matchadd()| won't consider - syntax rules and functions like |synconcealed()| and the - other way around. - - *:mat* *:match* -:mat[ch] {group} /{pattern}/ - Define a pattern to highlight in the current window. It will - be highlighted with {group}. Example: > - :highlight MyGroup ctermbg=green guibg=green - :match MyGroup /TODO/ -< Instead of // any character can be used to mark the start and - end of the {pattern}. Watch out for using special characters, - such as '"' and '|'. - - {group} must exist at the moment this command is executed. - - The {group} highlighting still applies when a character is - to be highlighted for 'hlsearch', as the highlighting for - matches is given higher priority than that of 'hlsearch'. 
- Syntax highlighting (see 'syntax') is also overruled by - matches. - - Note that highlighting the last used search pattern with - 'hlsearch' is used in all windows, while the pattern defined - with ":match" only exists in the current window. It is kept - when switching to another buffer. - - 'ignorecase' does not apply, use |/\c| in the pattern to - ignore case. Otherwise case is not ignored. - - 'redrawtime' defines the maximum time searched for pattern - matches. - - When matching end-of-line and MNV redraws only part of the - display you may get unexpected results. That is because MNV - looks for a match in the line where redrawing starts. - - Also see |matcharg()| and |getmatches()|. The former returns - the highlight group and pattern of a previous |:match| - command. The latter returns a list with highlight groups and - patterns defined by both |matchadd()| and |:match|. - - Highlighting matches using |:match| are limited to three - matches (aside from |:match|, |:2match| and |:3match| are - available). |matchadd()| does not have this limitation and in - addition makes it possible to prioritize matches. - - Another example, which highlights all characters in virtual - column 72 and more: > - :highlight rightMargin term=bold ctermfg=blue guifg=blue - :match rightMargin /.\%>72v/ -< To highlight all character that are in virtual column 7: > - :highlight col8 ctermbg=grey guibg=grey - :match col8 /\%<8v.\%>7v/ -< Note the use of two items to also match a character that - occupies more than one virtual column, such as a TAB. - -:mat[ch] -:mat[ch] none - Clear a previously defined match pattern. - - -:2mat[ch] {group} /{pattern}/ *:2match* -:2mat[ch] -:2mat[ch] none -:3mat[ch] {group} /{pattern}/ *:3match* -:3mat[ch] -:3mat[ch] none - Just like |:match| above, but set a separate match. Thus - there can be three matches active at the same time. The match - with the lowest number has priority if several match at the - same position. It uses the match id 3. 
- The ":3match" command is used by (MNV < 9.0.2054) |matchparen| - plugin. You are suggested to use ":match" for manual matching - and ":2match" for another plugin or even better make use of - the more flexible |matchadd()| (and similar) functions instead. - -============================================================================== -11. Fuzzy matching *fuzzy-matching* - -Fuzzy matching scores how well a string matches a pattern when the pattern -characters appear in order but not necessarily contiguously. - -Example: > - Pattern: "mnv" - Candidates: "mnv" -> perfect - "mnveo" -> good (m n v) - "voice mail" -> weaker (m _ n _ _ _ v) - "vintage" -> no match (no "m") -< -If the search string has multiple words, each word is matched separately and -may appear in any order in the candidate. For example "get pat" matches -"GetPattern", "PatternGet", "getPattern", "patGetter", "getSomePattern", -"MatchpatternGet", etc. - -The 'ignorecase' and 'smartcase' options do not apply, case is ignored if the -pattern is all lower case. - -MNV's implementation is based on the algorithm from the fzy project: -https://github.com/jhawthorn/fzy - -It uses dynamic programming to compute an optimal score for a given pattern -and candidate. - -The algorithm works in two stages: - -1. Forward pass - Scan the candidate left to right, tracking the best score for each - pattern position. Matches score higher when they occur at the start - of the candidate, the start of a word (space, underscore, dash, - camelCase), or directly after the previous match. - -2. Backward pass - Start from the best-scoring end position and step back to find match - positions, ensuring the alignment is optimal. - -MNV extends the original algorithm to support multibyte codepoints, allowing -correct matching for UTF-8 and other encodings. - -Time complexity is O(pattern * candidate). Memory usage is proportional -to the same. 
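The two stages above can be sketched in Python. This is a minimal illustration loosely modelled on the fzy approach, not MNV's implementation. The weights (`GAP`, `CONSECUTIVE`, `WORD_START`) and the function names are hypothetical and chosen only to make the example readable. The sketch always ignores case, handles a single word only, and relies on Python strings already being sequences of codepoints.

```python
from typing import List, Optional, Tuple

# Hypothetical weights for illustration; not the values used by MNV or fzy.
GAP = -1           # cost per skipped candidate character
CONSECUTIVE = 5    # bonus when a match directly follows the previous match
WORD_START = 4     # bonus at the start of the candidate or of a word
NO_MATCH = float("-inf")


def start_bonus(candidate: str, j: int) -> int:
    """Bonus for index 0, after space/underscore/dash, or a camelCase hump."""
    if j == 0:
        return WORD_START
    prev, cur = candidate[j - 1], candidate[j]
    if prev in " _-" or (prev.islower() and cur.isupper()):
        return WORD_START
    return 0


def fuzzy_match(pattern: str, candidate: str) -> Optional[Tuple[float, List[int]]]:
    """Return (score, match positions), or None when there is no match."""
    pat, cand = pattern.lower(), candidate.lower()
    n, m = len(pat), len(cand)
    if n == 0 or n > m:
        return None
    # d[i][j]: best score with pat[i] matched exactly at cand[j].
    # b[i][j]: best score for pat[:i+1] anywhere within cand[:j+1].
    d = [[NO_MATCH] * m for _ in range(n)]
    b = [[NO_MATCH] * m for _ in range(n)]

    # Forward pass: O(n * m) time and memory.
    for i in range(n):
        running = NO_MATCH
        for j in range(m):
            if pat[i] == cand[j]:
                if i == 0:
                    d[i][j] = j * GAP + start_bonus(candidate, j)
                elif j > 0:
                    d[i][j] = max(b[i - 1][j - 1] + start_bonus(candidate, j),
                                  d[i - 1][j - 1] + CONSECUTIVE)
            running = max(d[i][j], running + GAP)
            b[i][j] = running
    if b[n - 1][m - 1] == NO_MATCH:
        return None

    # Backward pass: step back from the end to recover one optimal alignment.
    positions = [0] * n
    need_consecutive = False
    j = m - 1
    for i in range(n - 1, -1, -1):
        while j >= 0:
            if d[i][j] != NO_MATCH and (need_consecutive or d[i][j] == b[i][j]):
                need_consecutive = (i > 0 and j > 0 and
                                    d[i][j] == d[i - 1][j - 1] + CONSECUTIVE)
                positions[i] = j
                j -= 1
                break
            j -= 1
    return b[n - 1][m - 1], positions


for name in ("patGetter", "GetPattern", "vintage"):
    print(name, fuzzy_match("pat", name))
```

With these weights "patGetter" scores higher than "GetPattern", because the match starts at the beginning of the candidate. "vintage" yields None, because it contains no "p". Both tables hold one entry per pattern/candidate character pair, which is where the O(pattern * candidate) time and memory cost comes from.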
- -The |matchfuzzy()| and |matchfuzzypos()| functions perform fuzzy searching in -a List of strings. |matchfuzzy()| returns the matching strings, while -|matchfuzzypos()| returns the matches along with their positions and scores. - -The "f" flag of `:mnvgrep` enables fuzzy matching. - -To enable fuzzy matching for |ins-completion|, add "fuzzy" to the -'completeopt' option. For |cmdline-completion|, add "fuzzy" to the -'wildoptions' option. - - mnv:tw=78:ts=8:noet:ft=help:norl: |
