Difference between revisions of "Sort Lines"

From TED Notepad
(Created page with '<noinclude>{{manversion|5.3.0.2}}__NOTOC__</noinclude> ====Sort... (Alt+Ctrl+S)==== Shows {{dialog|Sort Selection}} dialog and asks for {{field|sorting keys}}. Then sorts {{defi…')
 
Line 1: Line 1:
<noinclude>{{manversion|5.3.0.2}}__NOTOC__</noinclude>
+
<noinclude>{{manversion|6.0.0.16|feature}}__NOTOC__</noinclude>
====Sort... (Alt+Ctrl+S)====
+
====Sort.. (Alt+Ctrl+S)====
  
Shows {{dialog|Sort Selection}} dialog and asks for {{field|sorting keys}}. Then sorts {{defined|lines}} of the selection according to the given {{field|keys}}.
+
Sorts {{defined|lines}} of the selection according to the given sorting keys.
  
When lines of the selection are being sorted, they are compared against each other, and the {{field|sorting keys}} specify how to do that. You can use up to three such {{field|keys}}, named {{field|At first...}}, {{field|Then...}} and {{field|Finally...}}. If two lines are equal according to the 1<sup>st</sup> {{field|key}}, then the 2<sup>nd</sup> {{field|key}} is used (and likewise the 3<sup>rd</sup> {{field|key}}).
+
Up to five sorting keys can be used, each represented by its own tab in the {{dialog|Sort Lines}} dialog: {{field|Sort Lines|First}}, {{field|Sort Lines|Second}}, {{field|Sort Lines|Third}}, {{field|Sort Lines|Then}}, and {{field|Sort Lines|Finally}}. Sorting keys are applied one by one, in order: if two lines are considered equal according to the {{field|Sort Lines|First}} key, the {{field|Sort Lines|Second}} key is used, and so on. In other words, the {{field|Sort Lines|Finally}} key is consulted only when all four preceding sorting keys yield an indecisive comparison.
  
Note: The {{field|sorting keys}} are used only to sort the lines, e.g. to decide, which one goes first and which one will be the second one. The lines are not being modified during the sorting; only the order of them.
+
Note: Each sorting key offers a special {{field|Cutting line portion|Nothing}} option, which turns off the entire key. Such a key is not used for comparing lines during sorting, but it does not prevent later sorting keys from being queried. Using the {{field|Cutting line portion|Nothing}} option has the same effect as setting up a sorting key that always yields an indecisive comparison.
  
Each {{field|sorting key}} may specify:
+
Note: Sorting keys are used only to compare and sort the lines, i.e. to decide which line goes first and which goes second. The lines themselves are not modified, only their order.
  
; Whether to sort by that key, or not.
+
=====Cutting line portion=====
: There is usually no need to define more than one {{field|key}}, therefore the 2<sup>nd</sup> and 3<sup>rd</sup> keys are disabled by default.
 
: Note: You can not sort by the 3<sup>rd</sup> key, if you are not sorting by the 2<sup>nd</sup> key as well.
 
  
; Whether to use an entire line or only a specific {{field|column}} for the comparisons.
+
Each sorting key may cut a portion of a line, which is used for comparison upon sorting. {{:Cutting line portion}}
:  When lines are being compared, only this {{defined|column}} is taken into account, disregarding any other line content.
 
: Note: The {{defined|column}} is being allocated for each line separately, therefore the total number of columns on a line may vary from line to line. If you run out of columns on a line, you get an empty column for that line.
 
  
:; None, one or more {{field|column delimiters}} may be specified.
+
=====Direction and format=====
:: Each {{defined|delimiting}} character is treated independently, and all of them are {{defined|case sensitive}}. You can, for example, delimit columns by Spaces and also by Tabs. Then each Space or Tab (whichever comes first) starts a new column.
 
:: Example: A line {{string|HELLO&nbsp;WORLD}} would be:
 
::* divided into two columns ({{string|HELLO}} and {{string|WORLD}}), if delimited by a Space character;
 
::* divided into four columns ({{string|H}}, {{string|LL}}, {{string|&nbsp;W}} and {{string|RLD}}), if delimited by a set of {{string|EO}} characters;
 
::* divided into four columns ({{string|HE}}, {{string|<nowiki>{empty}</nowiki>}}, {{string|O&nbsp;WOR}} and {{string|D}}), if delimited by a single {{string|L}} character;
 
::* and ''divided'' into only one column ({{string|HELLO&nbsp;WORLD}}), if delimited by an {{string|X}} character or if ''delimited'' by no characters.
 
  
:; Whether to {{field|treat a sequence of delimiters as a single delimiter}}.
+
The meaning of each sorting key is specified by {{field|Sort Lines|Sort as}} options.
:: Tells that any sequence of a single delimiter is to be treated as a single delimiter. For example, in a string with several consecutive Spaces, all sequences of Spaces will be grouped together first. Only then the line will be divided into columns.
+
* {{field|Sort Lines|Text}} option tells that each portion of line previously cut is to be treated as a simple string of characters upon comparison &mdash; comparison of two strings of characters is done {{defined|lexicographically}}, i.e. by subsequently comparing each pair of corresponding characters from both strings (the first pair of different characters is the one that decides which string is greater).
:: Note: A sequence of multiple delimiters is not treated as a single delimiter. If a Space is followed by a Tab, both Space and Tab are delimiters, they will delimit three columns instead of two, leaving the second column empty.
+
** Note: {{defined|Character case}} can be ignored upon {{defined|lexicographical}} comparison, as well as entire characters and character types. See below.
:: Example: If this option is checked, a line {{string|HELLO&nbsp;WORLD}} would be divided into only three columns ({{string|HE}}, {{string|O&nbsp;WOR}} and {{string|D}}), if delimited by a single {{string|L}} character, exactly as if there was only one single {{string|L}} in the word {{string|HELLO}}.
+
* {{field|Sort Lines|Numbers}} option tells that each portion of line previously cut is to be treated as a {{defined|number}} upon comparison &mdash; comparison of two numbers is done by their natural order.
 +
** A {{defined|number}} may, in general, consist of a sign, numerals, decimal point and more numerals. Any of those may be omitted, e.g. {{string|3.14}}, {{string|+2.0}}, {{string|2.}}, {{string|-7}}, {{string|.3}} are all valid numbers. Total count of digits in a {{defined|number}} is not limited. Note that individual separate decimal point or positive/negative sign character with no numerals, as well as ''empty'' number, are all treated as 0.
 +
** Note: System locale settings (Regional settings) are used while comparing numbers. This includes decimal point, positive and negative signs, thousands separator, etc.
 +
** Note: All alpha-numeric characters are considered valid numerals, so hexadecimal numbers (or numbers in any other system with a radix up to 36) are supported. All numbers must share the same radix, however.
 +
** Note: All  {{defined|white-spaces}} are automatically ignored upon comparing {{defined|numbers}}. {{defined|Character case}} is ignored as well.
 +
** Note: The first unrecognized and non-ignored character after a recognizable {{defined|number}} terminates that {{defined|number}}. Any following text is ignored. Note, however, that all alpha-numeric characters are considered valid numerals and {{defined|white-spaces}} are ignored. Perhaps surprisingly, {{string|3 pigs}} is therefore a valid number, which is bigger than {{string|3 cows}} but far less than {{string|1 elephant}}. This is because both {{string|3pigs}} and {{string|3cows}} have only 5 digits, while {{string|1elephant}} has 9 digits. It is the same as comparing the numbers {{string|37198}}, {{string|32048}} and {{string|131375429}}.
 +
** See chapter [[Appendix]] for further details and examples of recognized {{defined|number formats}}.
  
:; Whether to {{field|count columns backwards}}.
+
Each sorting key can specify the resulting order of sorted lines using the {{field|Sort Lines|Direction}} with two intuitive options: {{field|Sort Lines|Ascending}} or {{field|Sort Lines|Descending}}.
:: If this option is checked, then a column 1 from left is the last column, column 2 from the left is the penultimate column, etc. Note again, that columns are set for each line separately, therefore total number of columns on a line may vary from line to line. If you run out of columns on a line, you get an empty column for that line.
 
  
; Whether to {{field|use only a char range}}.
+
===== options=====
: If this option is checked, then only this {{defined|char range}} of a chosen column is used when sorting lines. Note, that the columns are calculated first. Only then the {{defined|char range}} is cut from the resulting column. If you run out of chars in the column, you get an empty string for that line, even if the line originally continued after the column.
 
  
:; Whether to {{field|count the char range backwards}}.
+
Additional options are available to further modify which characters are compared and which are ignored upon comparison.
:: Very similar to {{field|count columns backwards}} above.
+
* {{field|Sort Lines|Ignore case}} option can be used to compare lines in {{defined|case insensitive}} manner. This option is implicit upon {{field|Sort Lines|Sort as}} {{field|Sort Lines|Numbers}}.
:: Note, that even if a {{defined|char range}} is counted backwards, the comparing of two lines is performed normally, from left to right. If you wish to compare lines backward, try to use the Reverse Lines tool to reverse the lines before and after the sort. For details see chapter [[Replacing tools]].
+
* {{field|Sort Lines|Ignore white-spaces}} option can be used to ignore all {{defined|white-spaces}} upon comparison. This option is implicit upon {{field|Sort Lines|Sort as}} {{field|Sort Lines|Numbers}}.
 +
* {{field|Sort Lines|Ignore punctuation}} option can be used to ignore all {{defined|punctuation characters}} upon comparison. System locale settings and system unicode tables are used to identify {{defined|punctuation characters}}.
 +
* {{field|Sort Lines|Ignore blank characters}} option can be used to ignore all {{defined|blank characters}} upon comparison. System unicode tables are used to identify {{defined|blank characters}} &mdash; invisible spacing characters.
 +
* {{field|Sort Lines|Ignore control characters}} option can be used to ignore all {{defined|control characters}} upon comparison. System unicode tables are used to identify {{defined|control characters}} &mdash; non-printable characters with special meaning.
 +
* {{field|Sort Lines|Ignore all but alpha-numeric characters}} option can be used to ignore all but {{defined|alpha-numeric characters}} upon comparison. System unicode tables are used to identify {{defined|alpha-numeric characters}}, which include letters and symbols from many different languages.
 +
* {{field|Sort Lines|Ignore characters}} option can be used to manually specify individual characters to be ignored upon comparison. Note that characters from this option are always {{defined|case sensitive}}.
 +
* {{field|Sort Lines|Ignore all but}} option can be used to manually specify individual characters to be accepted upon comparison, while ignoring any other characters. Note that characters from this option are always {{defined|case sensitive}}.
  
; Sorting {{field|Direction}}<nowiki>:</nowiki> {{field|ascending}} or {{field|descending}}.  
+
If a character is to be ignored upon comparison, it is as if all occurrences of such character were deleted before the comparison. Note, however, that the characters are never physically removed from the lines &mdash; the resulting lines are not modified by this tool.
: The tool always uses the actual locale settings to decide the order of the lines.
 
  
; Whether to compare two lines {{field|as numbers}} or {{field|as text}}.
+
Note: The set of characters actually ignored upon comparison is the union of all ignored characters as specified by the options above. Note that even if the {{field|Sort Lines|Ignore all but}} option accepts some character, the {{field|Sort Lines|Ignore characters}} option can still cause it to be ignored.
: Note, that number {{string|112}} is greater than number {{string|12}} if sorted as numbers, but it is smaller if sorted as text.
 
:;Sorting numbers
 
:: While sorting as numbers, all spaces are automatically ignored. TED Notepad is able to sort numbers of any length. The numbers may contain one decimal point and may start with a minus sign.
 
:: The comparison of two numbers is halted, when a first non-allowed character is found. The allowed characters are {{string|0-9}}, {{string|a-z}}, one decimal point, a minus sign at the beginning and all {{defined|white-spaces}}.
 
:: The numbers may be in decimal, binary, hexadecimal or any other base. The only important thing is that both compared numbers must be in the same base. This also means that you cannot meaningfully compare a line like {{string|13 pigs}} against a line like {{string|13 elephants}}, because the letters are valid numerals. In this case, {{string|13 pigs}} is less than {{string|13 elephants}}, because it has fewer digits.
 
::: <small>Actually, {{string|1 woman}} is always more than {{string|1 man}} no matter what... On the other hand, {{string|1$ woman}} and {{string|1$ man}} are equal when compared {{field|as numbers}}.</small>
 
  
; Whether to {{field|ignore case}} when comparing two lines.
+
See also {{feature|Sort Lines Ascending}}, {{feature|Sort Lines Descending}} and {{feature|Sort Lines}} tools.
: The {{defined|case}} is always {{defined|ignored}} when sorting {{field|as numbers}}.
 
  
; Whether to {{field|ignore specific characters}}, when comparing two lines.
+
See also {{feature|Shuffle Lines}} tool.
: If a character is to be ignored, it is completely stripped from both lines when they are being compared.
 
: Note: While comparing as lines, you can not specify, which characters are to be ignored.
 
: Note: The resulting lines are not modified, and the ignored characters are not stripped from the results.
 

Revision as of 19:53, 21 June 2011

This section is up to date for TED Notepad version 6.3.1.0.

Sort.. (Alt+Ctrl+S)

Sorts lines of the selection according to the given sorting keys.

Up to five sorting keys can be used, each represented by its own tab in the Sort Lines dialog: First, Second, Third, Then, and Finally. Sorting keys are applied one by one, in order: if two lines are considered equal according to the First key, the Second key is used, and so on. In other words, the Finally key is consulted only when all four preceding sorting keys yield an indecisive comparison.

Note: Each sorting key offers a special Nothing option, which turns off the entire key. Such a key is not used for comparing lines during sorting, but it does not prevent later sorting keys from being queried. Using the Nothing option has the same effect as setting up a sorting key that always yields an indecisive comparison.

Note: Sorting keys are used only to compare and sort the lines, i.e. to decide which line goes first and which goes second. The lines themselves are not modified, only their order.
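
The cascade of sorting keys can be illustrated with a short Python sketch. It is not TED Notepad's implementation; the comparison functions `by_length` and `by_text` are hypothetical stand-ins for configured keys, and `None` is used here to model a key set to Nothing.

```python
from functools import cmp_to_key

def compare_with_keys(a, b, keys):
    """Compare two lines by querying each sorting key in order.

    Each key is a function returning a negative, zero, or positive number.
    None models a key set to Nothing, which is always indecisive.
    """
    for key in keys:
        if key is None:      # Nothing: skip this key, keep querying later ones
            continue
        result = key(a, b)
        if result != 0:      # the first decisive key settles the order
            return result
    return 0                 # every key was indecisive

def by_length(a, b):
    # Hypothetical key: shorter lines go first.
    return len(a) - len(b)

def by_text(a, b):
    # Hypothetical key: plain lexicographic comparison.
    return (a > b) - (a < b)

lines = ["pear", "fig", "apple", "kiwi", "date"]
keys = [by_length, None, by_text]   # First, Second (Nothing), Third
sorted_lines = sorted(
    lines, key=cmp_to_key(lambda a, b: compare_with_keys(a, b, keys))
)
print(sorted_lines)
```

Here the three four-letter lines tie on the first key, the second key is skipped, and the third key breaks the tie. The input list itself is left untouched, mirroring the note that only the order of lines changes.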

Cutting line portion

Each sorting key may cut a portion of a line, which is used for comparison upon sorting.

Cutting a portion of a line is divided into two successive parts. First, columns are cut from the line, by either:

Note: Columns are cut for each line separately, so the total number of columns may vary from line to line. If a specified column number is beyond the total number of columns, an empty, zero-length portion is cut for that column from that line. It is therefore allowed to cut columns 2-7 even though some lines do not have enough columns to offer.

After the first part (cutting columns from the line), the resulting portion of the line can be further cropped by turning the Use only characters option on and specifying a range of character positions between the from position and the to position. This cuts off everything before the from position and everything after the to position.

  • Calculate the position backwards: from right to left can be used to number the character positions from the end of the line rather than from the beginning. Note, however, that this only affects how the positions are numbered before they are cropped; the text itself is not reversed.

Note: The range of character positions is always cropped after columns are cut. If a range is specified beyond the currently cut columns, an empty, zero-length portion is cropped from that line, even if the original line continues after the cut columns. In other words, the range of character positions cannot bring back what has already been cut off by the previous step.

Note: There is currently no way to cut columns after cropping a range of character positions.
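
The cropping step can be sketched in Python as follows. This is only an illustration under stated assumptions: positions are taken to be 1-based and inclusive at both ends, which the text above does not confirm, and `crop_range` is a hypothetical helper name. It operates on a portion that has already been cut from the line.

```python
def crop_range(portion, start, end, backwards=False):
    """Crop a range of character positions from an already cut portion.

    Assumes 1-based, inclusive positions. With backwards=True, position 1
    is the last character; the text itself is never reversed.
    """
    n = len(portion)
    if backwards:
        # Translate positions counted from the right into left-based ones.
        start, end = n - end + 1, n - start + 1
    start = max(start, 1)
    end = min(end, n)
    if start > end:
        return ""            # range lies beyond the portion: empty result
    return portion[start - 1:end]

print(crop_range("HELLO WORLD", 1, 5))                  # counted from the left
print(crop_range("HELLO WORLD", 1, 5, backwards=True))  # counted from the right
print(repr(crop_range("HELLO", 7, 9)))                  # beyond the portion
```

Counting backwards selects the last five characters but returns them in their original left-to-right order, and a range past the end of the portion yields an empty string rather than reaching back into the rest of the line.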

Note: The Column preview button displays a small portion of lines transformed by the current dialog values. Be aware that the preview is always generated from the current selection. If there is no selection, there is nothing to preview.

Direction and format

The meaning of each sorting key is specified by Sort as options.

  • The Text option treats each previously cut portion of a line as a simple string of characters. Two strings are compared lexicographically, i.e. by successively comparing each pair of corresponding characters from both strings; the first pair of different characters decides which string is greater.
  • The Numbers option treats each previously cut portion of a line as a number. Two numbers are compared by their natural order.
    • A number may, in general, consist of a sign, numerals, a decimal point and more numerals. Any of those may be omitted, e.g. 3.14, +2.0, 2., -7, .3 are all valid numbers. The total count of digits in a number is not limited. Note that a lone decimal point or a lone positive/negative sign with no numerals, as well as an empty number, are all treated as 0.
    • Note: System locale settings (Regional settings) are used while comparing numbers. This includes the decimal point, positive and negative signs, thousands separator, etc.
    • Note: All alpha-numeric characters are considered valid numerals, so hexadecimal numbers (or numbers in any other system with a radix up to 36) are supported. All numbers must share the same radix, however.
    • Note: All white-spaces are automatically ignored when comparing numbers. Character case is ignored as well.
    • Note: The first unrecognized and non-ignored character after a recognizable number terminates that number. Any following text is ignored. Note, however, that all alpha-numeric characters are considered valid numerals and white-spaces are ignored. Perhaps surprisingly, 3 pigs is therefore a valid number, which is bigger than 3 cows but far less than 1 elephant. This is because both 3pigs and 3cows have only 5 digits, while 1elephant has 9 digits. It is the same as comparing the numbers 37198, 32048 and 131375429.
    • See chapter Appendix for further details and examples of recognized number formats.
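
The rules for the Numbers option can be approximated by the following Python sketch. It is a simplified model, not the tool's actual algorithm: it assumes ASCII numerals 0-9 and a-z only, a fixed `.` decimal point, and no thousands separators, whereas the tool is described as following the system locale. The names `parse_number` and `compare_numbers` are hypothetical.

```python
import string

# Numerals in value order; ASCII order of these characters matches it.
DIGITS = string.digits + string.ascii_lowercase

def parse_number(text):
    """Split text into (sign, integer numerals, fraction numerals)."""
    s = "".join(text.split()).lower()   # ignore white-space and case
    sign = 1
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    i = 0
    while i < len(s) and s[i] in DIGITS:
        i += 1
    integer = s[:i].lstrip("0")
    fraction = ""
    if s[i:i + 1] == ".":
        j = i + 1
        while j < len(s) and s[j] in DIGITS:
            j += 1
        fraction = s[i + 1:j].rstrip("0")
    if not integer and not fraction:
        sign = 1                        # empty, lone sign or lone point: 0
    return sign, integer, fraction      # anything after the number is ignored

def compare_numbers(a, b):
    """Return -1, 0 or 1 by comparing a and b as numbers of unlimited length."""
    sa, ia, fa = parse_number(a)
    sb, ib, fb = parse_number(b)
    if sa != sb:
        return 1 if sa > sb else -1
    ka, kb = (len(ia), ia, fa), (len(ib), ib, fb)
    magnitude = (ka > kb) - (ka < kb)   # more integer digits means larger
    return magnitude * sa

print(compare_numbers("3 pigs", "3 cows"))      # same digit count, p beats c
print(compare_numbers("3 pigs", "1 elephant"))  # 5 digits against 9 digits
print(compare_numbers("2.", "+2.0"))            # equal values
```

Comparing the digit count first and the numerals second is what makes numbers of any length comparable without converting them to a fixed-size numeric type.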

Each sorting key can specify the resulting order of sorted lines using the Direction setting, which offers two options: Ascending or Descending.

options

Additional options are available to further modify which characters are compared and which are ignored upon comparison.

If a character is to be ignored upon comparison, it is as if all occurrences of that character were deleted before the comparison. Note, however, that the characters are never physically removed from the lines; the resulting lines are not modified by this tool.

Note: The set of characters actually ignored upon comparison is the union of all ignored characters as specified by the options above. Note that even if the Ignore all but option accepts some character, the Ignore characters option can still cause it to be ignored.
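
How the ignore options combine can be sketched in Python as below. This is a hedged illustration covering only the two manually specified options named above; `comparison_text` and its parameters are hypothetical names, and the function builds the text used for comparison without touching the original line.

```python
def comparison_text(portion, ignore_chars="", ignore_all_but=None):
    """Return the characters of portion that take part in the comparison.

    ignore_chars   models Ignore characters (case sensitive).
    ignore_all_but models Ignore all but (case sensitive); None means unused.
    A character is dropped if any option ignores it, i.e. the ignored sets
    are combined as a union.
    """
    kept = []
    for ch in portion:
        if ch in ignore_chars:
            continue
        if ignore_all_but is not None and ch not in ignore_all_but:
            continue
        kept.append(ch)
    return "".join(kept)

line = "a1-b2-c3"
print(comparison_text(line, ignore_chars="-"))
print(comparison_text(line, ignore_all_but="abc123", ignore_chars="1"))
print(line)   # the line itself is unchanged
```

In the second call the character 1 is accepted by Ignore all but, yet it is still dropped because Ignore characters lists it.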

See also Sort Lines Ascending, Sort Lines Descending and Sort Lines tools.

See also Shuffle Lines tool.