Difference between revisions of "Sort Lines"
Revision as of 19:53, 21 June 2011
Sort... (Alt+Ctrl+S)
Sorts lines of the selection according to the given sorting keys.
Up to five sorting keys can be used, each represented by its own tab in the Sort Lines dialog: First, Second, Third, Then, and Finally. Sorting keys are applied successively, one by one: if two lines are considered equal according to the First key, the Second key is used, and so on. In other words, the Finally key is consulted only when all four preceding sorting keys yield an indecisive comparison.
Note: Each sorting key offers a special Nothing option, which turns off the entire key. Such a key is not used for comparing lines, but it does not prevent later sorting keys from being queried. Using the Nothing option has the same effect as setting up a sorting key that always yields an indecisive comparison.
Note: Sorting keys are used only to compare and sort the lines, i.e. to decide which line goes first and which goes second. The lines themselves are not modified, only their order.
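The way successive keys break ties can be sketched in Python. This is only an illustration of the rule described above, not the tool's actual implementation; the names compare_with_keys and sort_lines and the sample keys are hypothetical.

```python
from functools import cmp_to_key

def compare_with_keys(a, b, keys):
    """Compare two lines with the sorting keys in order; later keys break ties."""
    for key in keys:
        if key is None:              # a key set to "Nothing" is skipped entirely
            continue
        ka, kb = key(a), key(b)
        if ka != kb:                 # first decisive key settles the comparison
            return -1 if ka < kb else 1
    return 0                         # every key was indecisive

def sort_lines(lines, keys):
    # Python's sorted() is stable; whether the tool is stable is not stated.
    return sorted(lines, key=cmp_to_key(lambda a, b: compare_with_keys(a, b, keys)))

lines = ["pear 3", "apple 7", "pear 1", "apple 2"]
first = lambda line: line.split()[0]         # hypothetical First key: the word
second = None                                # Second key set to "Nothing"
third = lambda line: int(line.split()[1])    # hypothetical Third key: the number
result = sort_lines(lines, [first, second, third])
print(result)
```

The Second key is disabled, yet the Third key is still consulted whenever the First key is indecisive. The returned lines are the original strings in a new order.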
Cutting line portion
Each sorting key may cut a portion of a line, which is used for comparison upon sorting.
Cutting a portion of a line is divided into two successive parts. First, columns are cut from the line, by either:
- Entire line, which keeps the entire original line.
- Columns, which allows specifying one or more consecutive columns between columns from and columns to (inclusive). Individual columns are separated by characters from Delimiting characters: the line is scanned for these delimiting characters, and wherever a delimiting character is found (any character from Delimiting characters), the old column ends before this delimiter and a new column begins after it. There can be as many delimiting characters as necessary, and all of them delimit columns equally and indiscriminately.
  - Optionally, Delimit by entire phrase rather than separate characters can be used to stop treating Delimiting characters as a set of individual characters, and scan for an entire delimiting phrase instead.
  - Optionally, Treat any sequence of delimiters as single delimiter can be used to count several successive delimiters as a single delimiter. This can be useful, for example, if input columns are aligned by spaces into neat visual columns, where each two columns are delimited by an arbitrary-length sequence of spaces. Such sequences need to be treated as indivisible column delimiters.
    - Note that even different delimiting characters are treated as a single indivisible delimiter if found in sequence.
  - Optionally, Calculate columns backwards: from right to left can be used to number the columns from the end of the line rather than the usual way. Note, however, that this only affects how the columns are numbered before they are cut; the text of the columns is not reversed.
  - Note: Delimiters are not included within the column being cut. However, if two or more consecutive columns are cut together, inner delimiters are not removed and become part of the line portion being cut.
  - Note: Delimiting characters are always case sensitive.
  - Examples: A line HELLO WORLD would be divided into:
    - two columns (HELLO and WORLD), if delimited by a Space character;
    - four columns (H, LL, W and RLD), if delimited by a set of EO characters;
    - four columns (HE, {empty}, O WOR and D), if delimited by an L character;
    - but only three columns (HE, O WOR and D), if the Treat any sequence of delimiters as single delimiter option is turned on;
    - and only one column (HELLO WORLD), if delimited by an X character, a lower case e character, or no characters at all, because none of these characters are found on the examined line.
Note: Columns are cut for each line separately, therefore the total number of columns on a line may vary from line to line. If the specified column number is beyond the total number of columns, an empty zero-length portion is cut for such a column from that line. Therefore, it is allowed to cut columns 2-7, even though some lines do not have enough columns to offer.
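The column rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's code: column_spans and cut_columns are hypothetical names, column numbers are taken to be 1-based, and the phrase and backwards options are left out for brevity.

```python
import re

def column_spans(line, delimiters, merge_runs=False):
    """Return (start, end) offsets of each column of the line.

    delimiters is a string of individual delimiting characters (case sensitive);
    merge_runs mimics "Treat any sequence of delimiters as single delimiter".
    """
    if not delimiters:
        return [(0, len(line))]      # no delimiters: the whole line is one column
    pattern = "[" + re.escape(delimiters) + "]" + ("+" if merge_runs else "")
    spans, start = [], 0
    for match in re.finditer(pattern, line):
        spans.append((start, match.start()))   # old column ends before the delimiter
        start = match.end()                    # new column begins after it
    spans.append((start, len(line)))
    return spans

def cut_columns(line, delimiters, first, last, merge_runs=False):
    """Cut columns first..last (1-based, inclusive), keeping inner delimiters."""
    chosen = column_spans(line, delimiters, merge_runs)[first - 1:last]
    if not chosen:
        return ""                    # requested columns lie beyond the last column
    return line[chosen[0][0]:chosen[-1][1]]

line = "HELLO WORLD"
by_l = [line[s:e] for s, e in column_spans(line, "L")]
by_l_merged = [line[s:e] for s, e in column_spans(line, "L", merge_runs=True)]
by_eo = [line[s:e] for s, e in column_spans(line, "EO")]
print(by_l, by_l_merged, by_eo)
print(repr(cut_columns(line, "L", 3, 4)), repr(cut_columns(line, " ", 2, 7)))
```

In this sketch the third column produced by the EO delimiters keeps its leading space, and cutting columns 3 to 4 with the L delimiter returns the inner delimiter as part of the portion.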
After the first part (cutting columns from the line), the resulting portion of the line can be further cropped by turning the Use only characters option on and specifying a range of character positions between from position and to position. This cuts off everything before the from position and everything after the to position.
- Calculate the position backwards: from right to left can be used to number the character positions from the end of the line rather than the usual way. Note, however, that this only affects how the positions are numbered before they are cropped; the text itself is not reversed.
Note: The range of character positions is always cropped after columns are cut. If a range is specified beyond the currently cut columns, an empty zero-length portion is cropped from that line, even if the original line continues after the cut columns. In other words, the range of character positions cannot bring back what has already been cut off by the previous step.
Note: There is currently no way to cut columns after cropping a range of character positions.
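A minimal Python sketch of the cropping step, assuming 1-based inclusive positions (the text does not state the base explicitly). The function name crop is hypothetical, and the column stage is reduced to a plain split on spaces.

```python
def crop(portion, start, end, backwards=False):
    """Keep characters start..end (1-based, inclusive) of an already cut portion."""
    if backwards:
        # Positions are counted from the right end; the text is not reversed.
        n = len(portion)
        start, end = n - end + 1, n - start + 1
    start = max(start, 1)
    return portion[start - 1:max(end, 0)]

line = "HELLO WORLD"
column = line.split(" ")[0]          # stage 1: cut the first space-delimited column
cropped = crop(column, 4, 8)         # stage 2: crop positions 4..8 of that column
from_right = crop("WORLD", 1, 2, backwards=True)
beyond = crop(column, 7, 9)          # range lies beyond the cut column
print(repr(cropped), repr(from_right), repr(beyond))
```

Because cropping operates on the portion produced by the column stage, positions 4 to 8 of the first column yield only its last two characters, even though the original line continues after that column.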
Note: Column preview button displays a small portion of lines transformed by current dialog values. Be aware that preview is always generated from the current selection. If there is no selection, then there is nothing to preview.
Direction and format
The meaning of each sorting key is specified by Sort as options.
- The Text option tells that each previously cut portion of a line is to be treated as a simple string of characters upon comparison. Two strings of characters are compared lexicographically, i.e. by successively comparing each pair of corresponding characters from both strings (the first pair of different characters decides which string is greater).
  - Note: Character case can be ignored upon lexicographical comparison, as well as entire characters and character types. See below.
- The Numbers option tells that each previously cut portion of a line is to be treated as a number upon comparison. Two numbers are compared by their natural order.
  - A number may, in general, consist of a sign, numerals, a decimal point and more numerals. Any of those may be omitted, e.g. 3.14, +2.0, 2., -7, .3 are all valid numbers. The total count of digits in a number is not limited. Note that a lone decimal point or a positive/negative sign character with no numerals, as well as an empty number, are all treated as 0.
  - Note: System locale settings (Regional settings) are used while comparing numbers. This includes the decimal point, positive and negative signs, thousands separator, etc.
  - Note: All alpha-numeric characters are considered valid numerals, thus hexadecimal numbers (or any other system with a radix up to 36) are supported. All numbers must share the same radix, however.
  - Note: All white-spaces are automatically ignored upon comparing numbers. Character case is ignored as well.
  - Note: The first unrecognized and non-ignored character after a recognizable number terminates that number. Any following text is ignored. Note, however, that all alpha-numeric characters are considered valid numerals and white-spaces are ignored, so the seemingly odd 3 pigs is a valid number, which is greater than 3 cows but far less than 1 elephant. This is because both 3pigs and 3cows have only 5 digits while 1elephant has 9 digits. It is the same as comparing the numbers 37198, 32048 and 131375429.
  - See chapter Appendix for further details and examples of recognized number formats.
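The number rules above can be approximated by the following Python sketch. It is an illustration only and makes several assumptions that are not taken from the tool: the decimal point is ".", signs are "+" and "-", thousands separators are not handled, only ASCII letters and digits count as numerals, and every number is read in radix 36. The function name number_key is hypothetical.

```python
import re
from fractions import Fraction

NUMBER = re.compile(r"([+-]?)([0-9a-z]*)(?:\.([0-9a-z]*))?")

def number_key(text):
    """Return an exact value that orders portions as described for Numbers."""
    cleaned = "".join(text.split()).lower()   # ignore white-spaces and case
    # Matching stops at the first character that is not part of a number.
    sign, whole, frac = NUMBER.match(cleaned).groups()
    frac = frac or ""
    value = Fraction(int(whole or "0", 36))   # missing numerals count as 0
    value += Fraction(int(frac or "0", 36), 36 ** len(frac))
    return -value if sign == "-" else value

animals = sorted(["3 pigs", "1 elephant", "3 cows"], key=number_key)
plain = sorted(["3.14", "+2.0", "-7", ".3"], key=number_key)
print(animals, plain)
```

Reading every number in radix 36 gives the same ordering as any smaller radix, provided all numbers share that radix, because the comparison reduces to digit count and then digit-by-digit order.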
Each sorting key can specify the resulting order of sorted lines using the Direction with two intuitive options: Ascending or Descending.
options
Additional options are available to further modify which characters are compared and which are ignored upon comparison.
- The Ignore case option can be used to compare lines in a case insensitive manner. This option is implicit with Sort as Numbers.
- The Ignore white-spaces option can be used to ignore all white-spaces upon comparison. This option is implicit with Sort as Numbers.
- The Ignore punctuation option can be used to ignore all punctuation characters upon comparison. System locale settings and system Unicode tables are used to identify punctuation characters.
- The Ignore blank characters option can be used to ignore all blank characters upon comparison. System Unicode tables are used to identify blank characters, i.e. invisible spacing characters.
- The Ignore control characters option can be used to ignore all control characters upon comparison. System Unicode tables are used to identify control characters, i.e. non-printable characters with special meaning.
- The Ignore all but alpha-numeric characters option can be used to ignore all but alpha-numeric characters upon comparison. System Unicode tables are used to identify alpha-numeric characters, which include letters and symbols from many different languages.
- The Ignore characters option can be used to manually specify individual characters to be ignored upon comparison. Note that characters from this option are always case sensitive.
- The Ignore all but option can be used to manually specify individual characters to be accepted upon comparison, while ignoring any other characters. Note that characters from this option are always case sensitive.
If a character is to be ignored upon comparison, it is as if all occurrences of such character were deleted before the comparison. Note, however, that the characters are never physically removed from the lines — the resulting lines are not modified by this tool.
Note: The set of characters actually ignored upon comparison is the union of all ignored characters as specified by the options above. Note that even if the Ignore all but option accepts some character, the Ignore characters option can still exclude it.
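How the ignore options combine into one union can be sketched in Python. This is a simplified illustration covering only some of the options; the names make_ignore_predicate and comparison_text are hypothetical, and Python's own Unicode data stands in for the system tables.

```python
import unicodedata

def make_ignore_predicate(ignore_whitespace=False, ignore_punctuation=False,
                          ignore_chars="", ignore_all_but=None):
    """Build a test telling whether a character is ignored upon comparison."""
    def ignored(ch):
        if ignore_whitespace and ch.isspace():
            return True
        if ignore_punctuation and unicodedata.category(ch).startswith("P"):
            return True
        if ch in ignore_chars:                    # case sensitive
            return True
        if ignore_all_but is not None and ch not in ignore_all_but:
            return True
        return False                              # not ignored by any option
    return ignored

def comparison_text(portion, ignored, ignore_case=False):
    """Text actually compared: as if every ignored character were deleted."""
    kept = "".join(ch for ch in portion if not ignored(ch))
    return kept.casefold() if ignore_case else kept

loose = make_ignore_predicate(ignore_whitespace=True, ignore_punctuation=True)
strict = make_ignore_predicate(ignore_chars="b", ignore_all_but="abc")
lines = ["b, anana", "Apple!", "  cherry"]
ordered = sorted(lines, key=lambda s: comparison_text(s, loose, ignore_case=True))
print(ordered, comparison_text("aabbcc dd", strict))
```

The strict predicate shows the union rule: the character b is accepted by Ignore all but, yet it is still ignored because Ignore characters lists it. The sorted lines are returned unchanged, since the filtering applies only to the comparison text.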
See also Sort Lines Ascending, Sort Lines Descending and Sort Lines tools.
See also Shuffle Lines tool.