Difference between revisions of "Sort tools"
Latest revision as of 08:32, 29 June 2011
Ascending (Alt+Ctrl+A)
Sorts lines of the selection in ascending order, comparing the entire lines as simple text. Does not ignore case or any characters upon comparison.
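The behaviour described above can be approximated with a minimal sketch. The function name `sort_lines` is hypothetical, and the sketch assumes that "simple text" comparison means plain code point order, which the manual does not state explicitly.

```python
def sort_lines(text, descending=False):
    """Sort the lines of a selection as plain, case-sensitive text.

    Assumption: lines are compared by Unicode code point, so upper-case
    ASCII letters sort before lower-case ones.
    """
    lines = text.split("\n")
    return "\n".join(sorted(lines, reverse=descending))


selection = "pear\nApple\napple\nBanana"
print(sort_lines(selection))
```

Under this assumption Apple and apple are distinct keys and do not end up adjacent, because no character is ignored and case is significant.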
See also Sort Lines Descending and Sort Lines tools.
See also Shuffle Lines tool.
Descending (Alt+Ctrl+Z)
Sorts lines of the selection in descending order, comparing the entire lines as simple text. Does not ignore case or any characters upon comparison.
See also Sort Lines Ascending and Sort Lines tools.
See also Shuffle Lines tool.
Sort.. (Alt+Ctrl+S)
Sorts lines of the selection according to the given sorting keys.
Up to five sorting keys can be used, each sorting key represented by its own tab in the Sort Lines dialog: First, Second, Third, Then, and Finally. Sorting keys are applied successively, one by one, i.e. if two lines are considered equal according to the First key, then the Second key is used, etc. In other words, the Finally key is consulted only if all four preceding sorting keys yield an indecisive comparison.
Note: Each sorting key offers a special Nothing option, which turns off the entire key. Such a key is not used for comparing lines upon sorting, but it does not prevent later sorting keys from being queried. Using the Nothing option has the same effect as setting up a sorting key that always yields an indecisive comparison.
Note: Sorting keys are used only to compare and sort the lines, i.e. to decide which of two lines goes first. The lines themselves are not modified, only their order.
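The fall-through between sorting keys can be illustrated with a small sketch. Everything here is hypothetical (the `make_comparator` helper, the representation of a key as an `(extract, descending)` pair, and `None` standing in for the Nothing option); it only mirrors the rules described above, not the tool's actual implementation.

```python
from functools import cmp_to_key


def make_comparator(keys):
    """Build a comparison function from an ordered list of sorting keys.

    Each key is either None (the "Nothing" option) or a pair
    (extract, descending), where extract maps a line to a comparable value.
    """
    def compare(a, b):
        for key in keys:
            if key is None:
                continue  # a disabled key never decides; later keys still apply
            extract, descending = key
            ka, kb = extract(a), extract(b)
            if ka != kb:
                result = -1 if ka < kb else 1
                return -result if descending else result
        return 0  # every key was indecisive
    return compare


lines = ["smith 30", "jones 25", "smith 25"]
keys = [
    (lambda s: s.split()[0], False),      # First: name, ascending
    None,                                 # Second: Nothing
    (lambda s: int(s.split()[1]), True),  # Third: number, descending
]
result = sorted(lines, key=cmp_to_key(make_comparator(keys)))
print(result)  # ['jones 25', 'smith 30', 'smith 25']
```

The two smith lines tie on the first key, the disabled second key is skipped, and the third key decides their order. The lines themselves are returned unchanged.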
Cutting line portion
Each sorting key may cut portions from lines, which are then used for comparison upon sorting.
Cutting a portion of a line is divided into two successive parts. First, columns are cut from the line, by either:
- Entire line, which keeps the entire original line.
- Columns, which allows one or more consecutive columns to be specified, between columns from and columns to (inclusive). Individual columns are separated by characters from Delimiting characters, which means that the line is scanned for these delimiting characters, and wherever a delimiting character is found (any character from Delimiting characters), the old column ends before this delimiter and a new column begins after this delimiter. There can be as many delimiting characters as necessary and all of them delimit columns equally and indiscriminately.
  - Optionally, Delimit by entire phrase rather than separate characters can be used to stop treating Delimiting characters as a set of individual characters, and scan for an entire delimiting phrase instead.
  - Optionally, Treat any sequence of delimiters as single delimiter can be used to count several successive delimiters as a single delimiter. This can be useful, for example, if input columns are aligned by spaces into neat visual columns, where each two columns are delimited by an arbitrary-length sequence of spaces. Such sequences need to be treated as indivisible column delimiters.
    - Note that even different delimiting characters are treated as a single indivisible delimiter, if found in sequence.
  - Optionally, Calculate columns backwards: from right to left can be used to number the columns from the end of line rather than the usual way. Note, however, that this only affects how the columns are numbered before they are cut; the text of the columns is not reversed.
  - Note: Delimiters are not included within the column being cut. However, if two or more consecutive columns are cut together, inner delimiters are not removed and become part of the line portion being cut.
  - Note: Delimiting characters are always case sensitive.
  - Examples: A line HELLO WORLD would be divided into:
    - two columns (HELLO and WORLD), if delimited by a Space character;
    - four columns (H, LL, W and RLD), if delimited by a set of EO characters;
    - four columns (HE, {empty}, O WOR and D), if delimited by an L character;
    - but only three columns (HE, O WOR and D), if the Treat any sequence of delimiters as single delimiter option is turned on;
    - and only one column (HELLO WORLD), if delimited by an X character, a lower case e character, or if delimited by no characters. This is because none of these characters are found on the examined line.
Note: Columns are cut for each line separately, therefore the total number of columns on a line may vary from line to line. If the specified column number is beyond the total number of columns, an empty zero-length portion is cut for such a column from that line. Therefore, it is allowed to cut columns 2-7, even though some lines do not have enough columns to offer.
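The column rules above can be sketched as follows. The helper names `column_spans` and `cut_columns` and their parameters are hypothetical, column numbers are assumed to be 1-based, and the right-to-left numbering option is left out for brevity. The sketch reproduces the HELLO WORLD examples; note that with the EO delimiters the third column keeps its leading space.

```python
import re


def column_spans(line, delimiters, phrase=False, merge=False):
    """Return (start, end) index pairs of the columns found on a line."""
    if not delimiters:
        return [(0, len(line))]  # no delimiters: the whole line is one column
    # Either one literal phrase, or a set of individual characters.
    unit = re.escape(delimiters) if phrase else "[" + re.escape(delimiters) + "]"
    pattern = "(?:" + unit + ")+" if merge else unit
    spans, start = [], 0
    for match in re.finditer(pattern, line):  # matching is case sensitive
        spans.append((start, match.start()))
        start = match.end()
    spans.append((start, len(line)))
    return spans


def cut_columns(line, delimiters, col_from, col_to, **options):
    """Cut columns col_from..col_to (1-based, inclusive), keeping inner delimiters."""
    selected = column_spans(line, delimiters, **options)[col_from - 1:col_to]
    if not selected:
        return ""  # requested columns lie beyond the last column
    return line[selected[0][0]:selected[-1][1]]


def columns(line, delimiters, **options):
    """Convenience helper: the text of every column on the line."""
    return [line[a:b] for a, b in column_spans(line, delimiters, **options)]


print(columns("HELLO WORLD", "EO"))             # ['H', 'LL', ' W', 'RLD']
print(columns("HELLO WORLD", "L"))              # ['HE', '', 'O WOR', 'D']
print(columns("HELLO WORLD", "L", merge=True))  # ['HE', 'O WOR', 'D']
print(cut_columns("HELLO WORLD", "L", 1, 2))    # 'HEL'
```

Cutting columns 1-2 with the L delimiter yields HEL: the delimiter between the two columns is kept, while the one after column 2 is not.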
After the first part (cutting columns from the line), the resulting portion of the line can be further cropped by turning the Use only characters option on, and specifying a range of character positions between from position and to position. This cuts off everything before the from position and everything after the to position.
- Calculate the position backwards: from right to left can be used to number the character positions from the end of line rather than the usual way. Note, however, that this only affects how the positions are numbered before they are cropped; the text itself is not reversed.
Note: Range of character positions is always cropped after columns are cut. If a range is specified beyond the currently cut columns, an empty zero-length portion is cropped from that line, even if the original line continues after the cut columns. In other words, a range of character positions cannot bring back what has already been cut off by the previous step.
Note: There is currently no way to cut columns after cropping a range of character positions.
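The cropping step can be sketched as below. The function name `crop` is hypothetical, and the sketch assumes 1-based, inclusive positions, which the text implies but does not state outright.

```python
def crop(portion, pos_from, pos_to, backwards=False):
    """Keep only characters pos_from..pos_to (1-based, inclusive) of a portion.

    With backwards=True the positions are counted from the end of the
    portion; the text itself is never reversed.
    """
    length = len(portion)
    if backwards:
        # Position p counted from the right is position length - p + 1 from the left.
        pos_from, pos_to = length - pos_to + 1, length - pos_from + 1
    start = max(pos_from, 1) - 1
    end = min(pos_to, length)
    return portion[start:end] if start < end else ""


print(crop("HELLO", 2, 4))                  # 'ELL'
print(crop("HELLO", 1, 2, backwards=True))  # 'LO'
print(crop("HELLO", 7, 9))                  # '' (range lies beyond the portion)
```

Because the function only ever sees the portion left after the column step, a range beyond that portion yields an empty string even when the original line was longer.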
Note: Column preview button displays a small portion of lines transformed by current dialog values. Be aware that preview is always generated from the current selection. If there is no selection, then there is nothing to preview.
Direction and format
The meaning of each sorting key is specified by Sort as options.
- Text option tells that each portion of line previously cut is to be treated as a simple string of characters upon comparison. Two strings of characters are compared lexicographically, i.e. by successively comparing each pair of corresponding characters from both strings (the first pair of different characters is the one that decides which string is greater).
  - Note: Character case can be ignored upon lexicographical comparison, as well as entire characters and character types. See below.
- Numbers option tells that each portion of line previously cut is to be treated as a number upon comparison. Two numbers are compared by their natural order.
  - A number may, in general, consist of a sign, numerals, decimal point and more numerals. Any of those may be omitted, e.g. 3.14, +2.0, 2., -7, .3 are all valid numbers. Total count of digits in a number is not limited. Note: An individual separate decimal point (.) or a solitary positive/negative sign character (+ or -) with no numerals following, as well as an empty number (empty string), are all treated as zero.
  - Note: System locale settings (Regional settings) are used while comparing numbers. This includes localization of decimal point, positive and negative signs, thousands separator, etc.
  - Note: All alpha-numeric characters are considered valid numerals, thus hexadecimal numbers (or any other system with radix up to 36) are supported. All numbers must share the same radix, however.
  - Note: All white-spaces are automatically ignored upon comparing numbers, and character case is ignored as well.
  - Note: The first unrecognized and non-ignored character after a recognizable number terminates that number. Any following text is ignored. Also remember that all alpha-numeric characters are considered valid numerals and all white-spaces are ignored. Thus, perhaps a bit mystifying, 3 pigs is a valid number, which is bigger than 3 cows, but is far less than 1 elephant. This is because both 3pigs and 3cows have 5 digits while 1elephant has 9 digits. It is similar to comparing 37198, 32048 and 131375429. Tricky, isn't it?
  - See chapter Appendix for further details and examples of recognized number formats.
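The pigs, cows and elephant example can be reproduced with a rough sketch. The function name `number_key` is hypothetical; the sketch handles integers only (no decimal point, locale settings or thousands separators), restricts numerals to ASCII digits and letters, and assumes radix 36 for all values.

```python
import re


def number_key(text, radix=36):
    """Interpret a line portion as an integer written with alpha-numeric digits."""
    compact = "".join(text.split()).lower()  # ignore white-space and case
    # An optional sign followed by the leading run of alpha-numeric digits;
    # the first unrecognized character ends the number.
    sign, digits = re.match(r"([+-]?)([0-9a-z]*)", compact).groups()
    value = int(digits, radix) if digits else 0  # lone sign or empty text is zero
    return -value if sign == "-" else value


animals = ["1 elephant", "3 pigs", "3 cows"]
print(sorted(animals, key=number_key))  # ['3 cows', '3 pigs', '1 elephant']
```

Both five-digit values are smaller than the nine-digit one, and 3pigs exceeds 3cows because the digit p is greater than the digit c.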
Each sorting key can specify the resulting order of sorted lines using the Direction with two intuitive options: Ascending or Descending.
Options
Additional options are available to further control which characters are compared and which are ignored upon comparison.
- Ignore case option can be used to compare lines in case insensitive manner. This option is implicit upon Sort as Numbers.
- Ignore white-spaces option can be used to ignore all white-spaces upon comparison. This option is implicit upon Sort as Numbers.
- Ignore punctuation option can be used to ignore all punctuation characters upon comparison. System locale settings and system unicode tables are used to identify punctuation characters.
- Ignore blank characters option can be used to ignore all blank characters upon comparison. System unicode tables are used to identify blank characters (invisible spacing characters).
- Ignore control characters option can be used to ignore all control characters upon comparison. System unicode tables are used to identify control characters (non-printable characters with special meaning).
- Ignore all but alpha-numeric characters option can be used to ignore all but alpha-numeric characters upon comparison. System unicode tables are used to identify alpha-numeric characters, which include letters and symbols from many different languages.
- Ignore characters option can be used to manually specify individual characters to be ignored upon comparison. Note that characters from this option are always case sensitive.
- Ignore all but option can be used to manually specify individual characters to be accepted upon comparison, while ignoring any other characters. Note that characters from this option are always case sensitive.
If a character is to be ignored upon comparison, it is as if all occurrences of such character were deleted before the comparison. Note, however, that the characters are never physically removed from the lines — the resulting lines are not modified by this tool.
Note: Set of characters to be actually ignored upon comparison is a union of all ignored characters as specified by the options above. Note that even if Ignore all but option specifies some character, Ignore characters option can take it off the table.
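The "as if deleted before comparison" rule can be sketched with a key function. The name `comparison_key` and its parameters are hypothetical, only a few of the options are modelled, and Python's own Unicode tables stand in for the system tables the tool uses.

```python
import unicodedata


def comparison_key(text, ignore_case=False, ignore_whitespace=False,
                   ignore_punctuation=False, ignore_chars="", ignore_all_but=None):
    """Build the string actually compared; the original line is left untouched."""
    kept = []
    for ch in text:
        if ignore_whitespace and ch.isspace():
            continue
        if ignore_punctuation and unicodedata.category(ch).startswith("P"):
            continue
        if ch in ignore_chars:  # case-sensitive membership test
            continue
        if ignore_all_but is not None and ch not in ignore_all_but:
            continue
        kept.append(ch)
    key = "".join(kept)
    return key.casefold() if ignore_case else key


lines = ["b 2", "a1", "A 0"]
ordered = sorted(lines, key=lambda s: comparison_key(
    s, ignore_case=True, ignore_whitespace=True))
print(ordered)  # ['A 0', 'a1', 'b 2']
```

A character is dropped if any active option ignores it, which matches the union rule: a character listed in both Ignore all but and Ignore characters is still ignored.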
See also Sort Lines Ascending, Sort Lines Descending and Sort Lines tools.
See also Shuffle Lines tool.
Shuffle (Ctrl+Shift+K)
Reorders lines of the selection randomly.
Warning: Although the tool uses a more complex pseudo-random generator (than a linear congruential generator), it is not safe to assume that the generator is cryptographically secure. This tool should not be used for encryption purposes if real sturdy protection is expected, since the pseudo-random generator is not constructed to withstand real cryptanalysis. It is only safe to assume that the generator is not easily predictable without advanced cryptanalysis — which is acceptable for most common tasks.
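As an illustration of the distinction drawn in the warning, here is a minimal sketch of shuffling lines with a replaceable random source. The function name `shuffle_lines` is hypothetical, and Python's generators are used in place of the tool's own, which the manual does not name.

```python
import random


def shuffle_lines(text, rng=None):
    """Return the lines of a selection in random order.

    rng may be any random.Random instance. The default generator is a
    general-purpose pseudo-random generator, not a cryptographic one.
    """
    rng = rng or random.Random()
    lines = text.split("\n")
    rng.shuffle(lines)
    return "\n".join(lines)


selection = "alpha\nbeta\ngamma\ndelta"
print(shuffle_lines(selection))                         # everyday use
print(shuffle_lines(selection, random.SystemRandom()))  # OS-provided randomness
```

Passing `random.SystemRandom()` draws from the operating system's entropy source, which is the usual choice when unpredictability matters. The default generator is adequate for ordinary reordering tasks.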
See also Sort Lines Ascending, Sort Lines Descending and Sort Lines tools.