Difference between revisions of "Appendix"

From TED Notepad
 
 
(31 intermediate revisions by the same user not shown)
Line 1: Line 1:
The meaning of some terms used in this manual is as follows below. Many of them are intuitive; some of them may not be well-known; and some of them are used here, only to describe exact actions of some tools within TED Notepad.</p>
+
<noinclude>{{manversion|6.0.2.0}}__NOTOC__</noinclude>
  
*A {{definition|white-space}} is a Space, a Tab or another character that can not be ''seen'' but ''takes place'' in the document. All other characters, which can be ''seen'', are called {{definition|graphs}}.
+
The meaning of some terms used in this manual is as follows below:
  
*An {{definition|alphanum}}<sup>*</sup> is an alpha-numeric character (ie. a, b, ..., z, A, B, ..., Z, 0, 1, ..., 9).
 
  
*A {{definition|capital}}<sup>*</sup> is any capital letter (ie. A, B, ..., Z).
+
* A {{definition|white-space}} is a Space or a Tab or another character that can not be ''seen'' but provides ''blank visual separator'' in the document. All other characters which can be ''seen'', are called {{definition|graphs}}. A {{definition|blank character}} is also a character that can not be ''seen'' but provides ''blank visual separator'' in the document. All {{definition|white-spaces}} are {{definition|blank characters}}, but some {{defined|control characters}} are {{definition|blank characters}} as well.
  
*Capitals are letters in {{definition|upper letter case}} or simply {{definition|upper case}} letters and their opposites are called {{definition|lower case}} letters and are in {{definition|lower letter case}} or simply in {{definition|lower case}}.
+
* An {{definition|alphanum}} is an {{definition|alpha-numeric character}}, i.e. {{string|a}}, {{string|b}}, ..., {{string|z}}; {{string|A}}, {{string|B}}, ..., {{string|Z}}; {{string|0}}, {{string|1}}, ..., {{string|9}}.
 +
** <small>Special characters like &aacute; (a with acute) belong to {{definition|alphanums}} only in certain locale settings. To be able to recognize these characters as {{definition|alphanums}} you need to use CTYPE category of a locale that supports it. TED Notepad always works with the current system locale settings.</small>
  
*To {{definition|ignore case}} is to ignore differences between {{definition|letter cases}} like {{definition|capitals}} and {{definition|lower case}} letters. When {{definition|ignoring case}}, letter {{string|a}} is equal to letter {{string|A}}, {{string|b}} equal to {{string|B}}, etc. An antonym of {{definition|ignore case}} is to {{definition|match case}} and an operation, that {{definition|matches case}} is {{definition|case sensitive}}.
+
* A {{definition|digit}} is any digit recognized by Unicode, i.e. {{string|0}}, {{string|1}}, ..., {{string|9}}, but also {{string|&sup1;}}, {{string|&sup2;}}, {{string|&sup3;}}, etc.
  
{{todo}}
+
* A {{definition|capital}} is any capital letter, i.e. {{string|A}}, {{string|B}}, ..., {{string|Z}}. These are called letters in {{definition|upper letter case}} or simply {{definition|upper case}} letters. Their opposites are called {{definition|lower case}} letters and are in {{definition|lower letter case}} or simply in {{definition|lower case}}.
 +
** <small>Special characters like &Aacute; (A with acute) belong to {{definition|capitals}} only in certain locale settings. To be able to recognize these characters as {{definition|capitals}} you need to use CTYPE category of a locale that supports it. TED Notepad always works with the current system locale settings.</small>
  
<p>A <b>string</b> is a sequence of characters. Typically, a <q>string</q> is used to describe a phrase, that a user have written in a dialog. (E.g. <cite>Find what</cite> and <cite>Replace with</cite> <q>strings</q> from <cite>Find</cite>/<cite>Replace</cite> dialogs are always used in <cite>find/replace</cite> mechanism.)</p>
+
* Other types of {{definition|character case}} include {{definition|word capitals}}, where each {{defined|word}} begins with a {{defined|capital}} and continues with {{definition|lower case}} letters; {{definition|first capital}}, where the first letter is a {{defined|capital}} and all others are {{definition|lower case}} letters; and {{definition|mixed case}}, where none of the above {{definition|letter cases}} can be determined.
  
<p>A <b>word</b> is a non-empty sequence of <q>alphanums</q>. Underscores may be optionally included<sup>**</sup> and phrase "<code>hello_world</code>" is then treated as a single <q>word</q>. All characters that a <q>word</q> can consist of are called <b>word letters</b>.</p>
+
* To {{definition|ignore case}} is to ignore differences between {{definition|letter cases}} like {{definition|capitals}} and {{definition|lower case}} letters. When {{definition|ignoring case}}, letter {{string|a}} is equal to letter {{string|A}}, {{string|b}} equal to {{string|B}}, etc. An antonym of {{definition|ignore case}} is to {{definition|match case}} and an operation, that {{definition|match case|matches case}} is {{definition|case sensitive}}.
  
<p>A <b>line</b> is a sequence of characters, where two <q>lines</q> are divided by a "<code>CR/NL</code>" characters sequence. Note, that if <cite>Word Wrap</cite> is turned on, a <q>line</q> may be wrapped, but within tools it will be treated only as a single <q>line</q>. Also note, that a single "<code>NL</code>" or "<code>CR</code>" character do not divide two <q>lines</q>.</p>
+
* To {{definition|mimic character case}} is to try to alter the {{definition|character case}} of some text based on the {{definition|character case}} of the original. Currently only ''basic'' types of {{defined|character case}} are recognized: {{defined|lower case}}, {{defined|upper case}}, {{defined|word capitals}}, {{defined|first capital}}. Everything else is considered {{defined|mixed case}}.
  
<p>An <b>empty line</b> is a <q>line</q>, that consists only of <q>white-spaces</q>. Therefore a <b>non-empty line</b> is a <q>line</q>, that contains at least one <q>graph</q> character.</p>
+
* There are also other types of characters recognized by TED Notepad:
 +
** A {{definition|punctuation character}} is any character recognized by Unicode as meant for punctuation purposes, e.g. quotation marks.
 +
** A {{definition|control character}} is a character from the very beginning of the ASCII table. These have special meaning and should be either avoided or treated with care.
  
<p>A <b>paragraph</b> is a sequence of <q>non-empty lines</q>. Two <q>paragraphs</q> are then divided by a non-empty sequence of <q>empty lines</q>.</p>
 
  
<p>A <b>sentence</b> is a sequence of characters that begins with a <q>capital</q> and ends with a Dot, a Question mark or an Exclamation mark. Example: "<code>Alice? Who the f... is Alice?</code>" are two <q>sentences</q>, but "<code>Alice? Who the f... Is Alice?</code>" are three <q>sentences</q>.</p>
+
* A {{definition|string}} is a sequence of characters. Typically, such a {{definition|string}} is used as a synonym for a ''phrase'' that a user has entered in a dialog. E.g. the {{field|Search|Find}} and {{field|Replace|Replace}} {{definition|strings}} from the {{dialog|Search and Replace}} dialog are always used in find/replace mechanisms.
  
<p>A <b>column</b> is a sequence of characters on a <q>line</q>. Two <q>columns</q> are divided by any of the <b>column delimiters</b>. A <q>column</q> can not exceed a <q>line</q>. Typically, when a <q>line</q> is divided into logical parts by a special <q>delimiter</q> character (e.g. a Tab character), those parts are called <q>columns</q>. <q>Columns</q> are used to cut out a sub-<q>string</q> from a <q>line</q>.</p>
 
  
<p>A <b>char range</b> is a sub-sequence of characters that begins and ends at the specified positions. Char range is used to cut out a sub-<q>string</q> from a longer <q>column</q>.</p>
+
* A {{definition|word}} is a non-empty sequence of {{definition|alphanums}}. Underscores may optionally be included within words; a phrase like {{string|hello_world}} is then also treated as a single {{definition|word}}. All characters a {{definition|word}} can consist of are called {{definition|word letters}} or {{definition|word characters}}. Other characters are called {{definition|word delimiters}} or {{definition|non-word characters}}. See section [[General page]] of the {{dialog|Settings}} dialog for more information about Underscores in words.
  
<p>An <b>actual insertion point</b> (also called a <b>cursor position</b>) is a position of the caret in the document or the end of the actual selection, if any. Note, that in special cases, it is the beginning of the selection, if any. These special cases are tools/features that work backward. (e.g. <cite>Find Previous</cite> or <cite>BkSpace Word</cite>.)</p>
+
* A {{definition|word boundary}} is a {{defined|word}} beginning or {{defined|word}} end. This is the place where one of the characters around is a {{defined|word character}} and the other is either a {{defined|non-word character}} or there is no character at all.
  
<strong>unique</strong>
 
  
 +
* A {{definition|line}} is a sequence of characters, where two {{definition|lines}} are divided by one {{definition|newline}}. Note that if {{feature|Word Wrap}} is turned on, a {{definition|line}} may be visually wrapped into several visual lines, but within all tools and most features it will still be treated as a single unbroken {{definition|line}}. Visual word-wrapping seldom has any impact on how {{definition|lines}} are treated within tools and features.
  
<small>*: Special characters like á (a with acute) do not belong to <q>alphanums</q>, nor <q>capitals</q>, in English locale settings. To be able to recognize those characters as <q>alphanums</q> and <q>capitals</q>, you have to use CTYPE category of the locale that supports it. TED Notepad always works with the system locale settings.</small><br><br>
+
* An {{definition|empty line}} is a {{definition|line}}, which consists of {{definition|white-spaces}} only. Therefore a {{definition|non-empty line}} is a {{definition|line}}, which contains at least one {{definition|graph}} character. Please note that there might be many {{definition|white-spaces}} and still the {{definition|line}} would be considered {{definition|empty line|empty}}.
  
<small>**: See section <a href="#basic_sett">.. </a>.</small>
+
 
 +
* A {{definition|paragraph}} is a sequence of {{definition|non-empty lines}}. Two {{definition|paragraphs}} are divided by a sequence of {{definition|empty lines}}. There is no such thing as empty {{definition|paragraph}}, since sequences of {{definition|empty lines}} are always grouped together when determining {{definition|paragraphs}}.
 +
 
 +
* A {{definition|sentence}} is a sequence of characters that begins with a {{definition|capital}} and ends with a Dot, a Question mark or an Exclamation mark. Example: {{string|Alice? Who the f... is Alice?}} is two {{definition|sentences}}, but {{string|Alice? Who the f... Is Alice?}} is three {{definition|sentences}}. Unfortunately, even {{string|How are you today, Mr. President?}} is treated as two {{definition|sentences}}.
 +
 
 +
 
 +
* A {{definition|line column}} is a part of a {{defined|line}} that meets certain {{definition|column criteria}}. As these column criteria are applied to subsequent {{defined|lines}}, they determine a logical {{definition|column}} of text over these lines. The criteria are applied to each {{defined|line}} independently, so the resulting column of text may be visually hard to identify. Nevertheless, the column criteria are met for each individual {{defined|line}}.
 +
** Note: A {{definition|line column}} is always one solid line portion, i.e. one line column can never consist of two separate portions of the same line. This is because a {{defined|line column}} is a logical part of a {{defined|line}}: it only specifies where it begins and where it ends on each line.
 +
** Applicable column criteria may change from feature to feature and from tool to tool, but they usually include:
 +
*** Dividing each {{defined|line}} into portions using {{definition|delimiting characters}}, also called {{definition|delimiters}}. These delimiting characters are located within each line and the line is split into portions. A splitting point occurs at any of these characters. These portions are numbered. The criteria then specify which consecutive portions are to be selected for the line column. Note: Delimiting characters enclosing the selected portions are not included within the line column, but any delimiting characters between the selected portions are naturally included.
 +
*** Dividing each {{defined|line}} into portions using a {{definition|delimiting phrase}}. In contrast to the {{defined|delimiting characters}}, a delimiting phrase is always located within each line as a whole sequence of characters, not as a set of individual and interchangeable characters. The line is split into numbered portions wherever this whole delimiting phrase is found. The criteria then specify which consecutive portions are to be selected for the line column. Note: Delimiting phrases enclosing the selected portions are not included within the line column, but any delimiting phrases between the selected portions are naturally included.
 +
*** Taking only a portion of each {{defined|line}} based on a {{definition|range of characters}}. A range of characters is simply a starting and an ending point within the line. All characters between the starting and ending points are selected for the line column.
 +
*** Certain combinations of the above criteria can be used to further reduce the column. For example, a set of {{defined|delimiting characters}} can be used to split the line and select only the second part, and then a {{defined|range of characters}} can be used to further reduce that part at its beginning and/or at its end. Note that these criteria are applied in sequence and their results compound, i.e. later criteria obey prior criteria and never try to reach outside of boundaries set by preceding criteria.
 +
 
 +
 
 +
* An {{definition|actual insertion point}} (also called a {{definition|current caret location}}) is the position of the caret in the document. It is also the end of the current selection, if any. Note that the end of the selection is where the user stops selecting the text; therefore, if text is selected upwards, the selection end visually precedes the selection beginning.
 +
 
 +
 
 +
* To {{definition|unique}} lines is to remove duplicate lines, to unify them. If lines or words have been {{definition|uniqued}}, it means that each line (or word) is unique in the results and that no two lines (or words) are of the same text.

Latest revision as of 15:34, 12 March 2014

This section is up to date for TED Notepad version 6.3.1.0.

Some terms used in this manual have the following meanings:


  • An alphanum is an alpha-numeric character, i.e. a, b, ..., z; A, B, ..., Z; 0, 1, ..., 9.
    • Special characters like á (a with acute) belong to alphanums only in certain locale settings. To have these characters recognized as alphanums, you need to use the CTYPE category of a locale that supports them. TED Notepad always works with the current system locale settings.
  • A digit is any digit recognized by Unicode, i.e. 0, 1, ..., 9, but also ¹, ², ³, etc.
  • A capital is any capital letter, i.e. A, B, ..., Z. These are called letters in upper letter case or simply upper case letters. Their opposites are called lower case letters and are in lower letter case or simply in lower case.
    • Special characters like Á (A with acute) belong to capitals only in certain locale settings. To have these characters recognized as capitals, you need to use the CTYPE category of a locale that supports them. TED Notepad always works with the current system locale settings.
  • There are also other types of characters recognized by TED Notepad:
    • A punctuation character is any character recognized by Unicode as meant for punctuation purposes, e.g. quotation marks.
    • A control character is a character from the very beginning of the ASCII table. These have special meaning and should be either avoided or treated with care.
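
The character classes above can be approximated in a short Python sketch. This is only an illustration: `classify` is a hypothetical helper, and it relies on Python's Unicode tables rather than on the system locale that TED Notepad uses, so locale-dependent cases such as á may be classified differently by the editor itself.

```python
import unicodedata


def classify(ch: str) -> str:
    """Roughly classify a single character using the terms of this appendix.

    Assumption: Python's Unicode data stands in for TED Notepad's
    locale-dependent character tables, which the manual does not list.
    """
    # "From the very beginning of the ASCII table" is read here as codes 0..31.
    if ord(ch) < 32:
        return "control"
    # str.isdigit() accepts 0..9 as well as superscript digits such as ².
    if ch.isdigit():
        return "digit"
    if ch.isalpha():
        return "capital" if ch.isupper() else "letter"
    # Unicode general categories starting with "P" are punctuation.
    if unicodedata.category(ch).startswith("P"):
        return "punctuation"
    return "other"


for sample in ["A", "z", "7", "²", '"', "\x01"]:
    print(repr(sample), classify(sample))
```

In this sketch a letter or a digit corresponds to an alphanum in the sense defined above.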




  • A line is a sequence of characters, where two lines are divided by one newline. Note that if Word Wrap is turned on, a line may be visually wrapped into several visual lines, but within all tools and most features it will still be treated as a single unbroken line. Visual word-wrapping seldom has any impact on how lines are treated within tools and features.


  • A sentence is a sequence of characters that begins with a capital and ends with a Dot, a Question mark or an Exclamation mark. Example: "Alice? Who the f... is Alice?" is two sentences, but "Alice? Who the f... Is Alice?" is three sentences. Unfortunately, even "How are you today, Mr. President?" is treated as two sentences.
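
One way to reproduce the three examples above is the following Python sketch. It is a guess that happens to agree with those examples, not TED Notepad's actual algorithm: `split_sentences` is a hypothetical name, only ASCII capitals are considered, and a sentence is assumed to end at a terminator that is followed by the end of the text or by white-space and a capital.

```python
import re

# A capital, then as little text as possible, then one or more terminators,
# provided that what follows is white-space plus a capital, or the end of text.
SENTENCE = re.compile(r"[A-Z].*?[.?!]+(?=\s+[A-Z]|\s*$)", re.DOTALL)


def split_sentences(text: str) -> list[str]:
    """Return the sentences found in text, in order of appearance."""
    return SENTENCE.findall(text)


print(split_sentences("Alice? Who the f... is Alice?"))
print(split_sentences("Alice? Who the f... Is Alice?"))
print(split_sentences("How are you today, Mr. President?"))
```

The last call shows the limitation mentioned above: the Dot after "Mr" is followed by a capital, so the text is split there.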


  • A line column is a part of a line that meets certain column criteria. As these column criteria are applied to subsequent lines, they determine a logical column of text over these lines. The criteria are applied to each line independently, so the resulting column of text may be visually hard to identify. Nevertheless, the column criteria are met for each individual line.
    • Note: A line column is always one solid line portion, i.e. one line column can never consist of two separate portions of the same line. This is because a line column is a logical part of a line: it only specifies where it begins and where it ends on each line.
    • Applicable column criteria may change from feature to feature and from tool to tool, but they usually include:
      • Dividing each line into portions using delimiting characters, also called delimiters. These delimiting characters are located within each line and the line is split into portions. A splitting point occurs at any of these characters. These portions are numbered. The criteria then specify which consecutive portions are to be selected for the line column. Note: Delimiting characters enclosing the selected portions are not included within the line column, but any delimiting characters between the selected portions are naturally included.
      • Dividing each line into portions using a delimiting phrase. In contrast to the delimiting characters, a delimiting phrase is always located within each line as a whole sequence of characters, not as a set of individual and interchangeable characters. The line is split into numbered portions wherever this whole delimiting phrase is found. The criteria then specify which consecutive portions are to be selected for the line column. Note: Delimiting phrases enclosing the selected portions are not included within the line column, but any delimiting phrases between the selected portions are naturally included.
      • Taking only a portion of each line based on a range of characters. A range of characters is simply a starting and an ending point within the line. All characters between the starting and ending points are selected for the line column.
      • Certain combinations of the above criteria can be used to further reduce the column. For example, a set of delimiting characters can be used to split the line and select only the second part, and then a range of characters can be used to further reduce that part at its beginning and/or at its end. Note that these criteria are applied in sequence and their results compound, i.e. later criteria obey prior criteria and never try to reach outside of boundaries set by preceding criteria.
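
The column criteria described above can be sketched in Python as follows. All function names are hypothetical, and two details are assumptions the manual does not state: portions are numbered from 1, and a range of characters is given as a 0-based start offset and an exclusive end offset.

```python
def _select(line: str, starts: list[int], ends: list[int], first: int, last: int) -> str:
    """Return portions first..last (1-based, inclusive) as one solid slice."""
    if first < 1 or first > last or first > len(starts):
        return ""
    last = min(last, len(starts))
    # Slicing from the start of the first portion to the end of the last one
    # keeps inner delimiters and drops the enclosing ones.
    return line[starts[first - 1]:ends[last - 1]]


def column_by_delimiters(line: str, delimiters: str, first: int, last: int) -> str:
    """Split at every character found in delimiters, then select portions."""
    starts, ends = [0], []
    for offset, ch in enumerate(line):
        if ch in delimiters:
            ends.append(offset)
            starts.append(offset + 1)
    ends.append(len(line))
    return _select(line, starts, ends, first, last)


def column_by_phrase(line: str, phrase: str, first: int, last: int) -> str:
    """Split only where the whole delimiting phrase occurs, then select portions."""
    if not phrase:
        raise ValueError("the delimiting phrase must not be empty")
    starts, ends = [0], []
    pos = line.find(phrase)
    while pos != -1:
        ends.append(pos)
        starts.append(pos + len(phrase))
        pos = line.find(phrase, pos + len(phrase))
    ends.append(len(line))
    return _select(line, starts, ends, first, last)


def char_range(text: str, start: int, end: int) -> str:
    """Keep the characters between the starting and ending points."""
    return text[start:end]


print(column_by_delimiters("a,b;c,d", ",;", 2, 3))    # b;c
print(column_by_phrase("a::b:c::d", "::", 2, 2))      # b:c
# Criteria applied in sequence: the range never reaches outside the column.
column = column_by_delimiters("name\tJohn Smith\t42", "\t", 2, 2)
print(char_range(column, 0, 4))                       # John
```

Because each function returns a single slice of the line, the result is always one solid portion, as the note above requires.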


  • An actual insertion point (also called a current caret location) is the position of the caret in the document. It is also the end of the current selection, if any. Note that the end of the selection is where the user stops selecting the text; therefore, if text is selected upwards, the selection end visually precedes the selection beginning.
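
The relationship between the selection and the insertion point can be modelled with a small Python sketch. The `Selection` class and its field names are hypothetical and only illustrate the definition above; they do not describe TED Notepad's internals.

```python
from dataclasses import dataclass


@dataclass
class Selection:
    anchor: int  # offset where the user started selecting
    caret: int   # offset where the user stopped selecting

    @property
    def insertion_point(self) -> int:
        """The insertion point is the end of the selection, i.e. the caret."""
        return self.caret

    @property
    def visual_span(self) -> tuple[int, int]:
        """The selected text as (lower offset, higher offset)."""
        return (min(self.anchor, self.caret), max(self.anchor, self.caret))


# Selecting upwards: the selection end (4) precedes its beginning (10).
upwards = Selection(anchor=10, caret=4)
print(upwards.insertion_point, upwards.visual_span)
```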


  • To unique lines is to remove duplicate lines, to unify them. If lines or words have been uniqued, it means that each line (or word) is unique in the results and that no two lines (or words) have the same text.
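
A minimal Python sketch of uniquing lines is shown below. It assumes that the first occurrence of each line is kept and that the original order is preserved; the manual does not say which duplicate survives. The `unique_lines` name and its `ignore_case` option are hypothetical.

```python
def unique_lines(text: str, ignore_case: bool = False) -> str:
    """Remove duplicate lines, keeping the first occurrence of each."""
    seen = set()
    kept = []
    for line in text.split("\n"):
        # When ignoring case, lines that differ only in letter case are equal.
        key = line.casefold() if ignore_case else line
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return "\n".join(kept)


print(unique_lines("a\nb\na\nB"))
print(unique_lines("a\nb\na\nB", ignore_case=True))
```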