The STRING signature
signature STRING
structure String :> STRING
  where type string = string
  where type string = CharVector.vector
  where type char = Char.char
structure WideString :> STRING  (* OPTIONAL *)
  where type string = WideCharVector.vector
  where type char = WideChar.char
The STRING signature specifies the basic operations on a string type, which is a vector of the underlying character type char as defined in the structure.

The STRING signature is matched by two structures, the required String and the optional WideString. The former implements strings based on the extended ASCII 8-bit characters, and is a companion structure to the Char structure. The latter provides strings of characters of some size greater than or equal to 8 bits, and is related to the structure WideChar. In particular, the type String.char is identical to the type Char.char and, when WideString is defined, the type WideString.char is identical to the type WideChar.char. These connections are made explicit in the Text and WideText structures, which match the TEXT signature.
eqtype string
eqtype char
val maxSize : int
val size : string -> int
val sub : string * int -> char
val extract : string * int * int option -> string
val substring : string * int * int -> string
val ^ : string * string -> string
val concat : string list -> string
val concatWith : string -> string list -> string
val str : char -> string
val implode : char list -> string
val explode : string -> char list
val map : (char -> char) -> string -> string
val translate : (char -> string) -> string -> string
val tokens : (char -> bool) -> string -> string list
val fields : (char -> bool) -> string -> string list
val isPrefix : string -> string -> bool
val isSubstring : string -> string -> bool
val isSuffix : string -> string -> bool
val compare : string * string -> order
val collate : (char * char -> order)
-> string * string -> order
val < : string * string -> bool
val <= : string * string -> bool
val > : string * string -> bool
val >= : string * string -> bool
val toString : string -> String.string
val scan : (char, 'a) StringCvt.reader
-> (string, 'a) StringCvt.reader
val fromString : String.string -> string option
val toCString : string -> String.string
val fromCString : String.string -> string option
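val maxSize : int
The longest allowed size of a string.

size s
returns |s|, the number of characters in the string s.

sub (s, i)
returns the i(th) character of s, counting from zero. This raises Subscript if i < 0 or |s| <= i.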
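A minimal sketch of size and sub, including the Subscript case. The sample string and the value names are illustrative assumptions, not part of the specification.

```sml
(* `s` is a hypothetical sample string. *)
val s = "abc"
val n = String.size s                 (* 3 *)
val first = String.sub (s, 0)         (* #"a" *)
val last = String.sub (s, n - 1)      (* #"c" *)

(* Index n is one past the end, so sub raises Subscript. *)
val raised = (String.sub (s, n); false) handle Subscript => true
```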
extract (s, i, NONE)
extract (s, i, SOME j)
substring (s, i, j)
These return substrings of s. The first form returns the substring of s from the i(th) character to the end of the string, i.e., the string s[i..|s|-1]. It raises Subscript if i < 0 or |s| < i. The second form returns the substring of size j starting at index i, i.e., the string s[i..i+j-1]. It raises Subscript if i < 0 or j < 0 or |s| < i + j. Note that, if defined, extract returns the empty string when i = |s|.
The third form returns the substring s[i..i+j-1], i.e., the substring of size j starting at index i. This is equivalent to extract (s, i, SOME j).

Implementation note:
Implementations of these functions must perform bounds checking in such a way that the Overflow exception is not raised.
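The three forms can be compared side by side. This is a sketch on a hypothetical sample string; the value names are chosen here for illustration only.

```sml
(* `text` is a hypothetical sample string of size 11. *)
val text = "hello world"

val tail = String.extract (text, 6, NONE)         (* "world" *)
val head = String.extract (text, 0, SOME 5)       (* "hello" *)
val head' = String.substring (text, 0, 5)         (* "hello" *)

(* i = |s| is allowed and yields the empty string. *)
val atEnd = String.extract (text, String.size text, NONE)   (* "" *)

(* 6 + 6 exceeds the size of the string, so Subscript is raised. *)
val outOfRange =
  (String.substring (text, 6, 6); false) handle Subscript => true
```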
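s ^ t
is the concatenation of the strings s and t. This raises Size if |s| + |t| > maxSize.

concat l
is the concatenation of all the strings in l. This raises Size if the sum of all the sizes is greater than maxSize.

concatWith s l
returns the concatenation of the strings in the list l using the string s as a separator. This raises Size if the size of the resulting string would be greater than maxSize.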
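A short sketch of the three concatenation operations, using illustrative literals:

```sml
val joined = "foo" ^ "bar"                              (* "foobar" *)
val all = String.concat ["a", "b", "c"]                 (* "abc" *)
val listed = String.concatWith ", " ["a", "b", "c"]     (* "a, b, c" *)

(* The separator appears only between elements. *)
val single = String.concatWith ", " ["a"]               (* "a" *)
val none = String.concatWith ", " []                    (* "" *)
```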
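str c
is the string of size one containing the character c.

implode l
generates the string containing the characters in the list l. This is equivalent to concat (List.map str l). This raises Size if the resulting string would have size greater than maxSize.

explode s
is the list of characters in the string s.

map f s
applies f to each character of s from left to right, returning the resulting string. It is equivalent to implode (List.map f (explode s)).

translate f s
returns the string generated from s by mapping each character in s by f. It is equivalent to concat (List.map f (explode s)).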
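The stated equivalences can be checked directly. In this sketch, escapeAmp is a hypothetical helper introduced only for illustration.

```sml
val chars = String.explode "abc"                 (* [#"a", #"b", #"c"] *)
val rebuilt = String.implode chars               (* "abc" *)

(* map and its list-based equivalent give the same result. *)
val upper = String.map Char.toUpper "abc"        (* "ABC" *)
val upper' =
  String.implode (List.map Char.toUpper (String.explode "abc"))

(* Hypothetical helper: expand one character into a longer string. *)
fun escapeAmp #"&" = "&amp;"
  | escapeAmp c = String.str c

val escaped = String.translate escapeAmp "a&b"   (* "a&amp;b" *)
val escaped' =
  String.concat (List.map escapeAmp (String.explode "a&b"))
```

Unlike map, translate may change the size of the string, because each character is replaced by a string of any length.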
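tokens f s
fields f s
These functions return a list of tokens or fields, respectively, derived from s from left to right. A token is a non-empty maximal substring of s not containing any delimiter. A field is a (possibly empty) maximal substring of s not containing any delimiter. In both cases, a delimiter is a character satisfying the predicate f.
Two tokens may be separated by more than one delimiter, whereas two fields are separated by exactly one delimiter. For example, if the only delimiter is the character #"|", then the string "|abc||def" contains two tokens "abc" and "def", whereas it contains the four fields "", "abc", "" and "def".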
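The "|abc||def" example above, written out as code. The predicate name isBar is an assumption made for this sketch.

```sml
(* Hypothetical delimiter predicate: only #"|" is a delimiter. *)
fun isBar c = (c = #"|")

val toks = String.tokens isBar "|abc||def"   (* ["abc", "def"] *)
val flds = String.fields isBar "|abc||def"   (* ["", "abc", "", "def"] *)
```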
isPrefix s1 s2
isSubstring s1 s2
isSuffix s1 s2
These functions return true if the string s1 is a prefix, substring, or suffix (respectively) of the string s2. Note that the empty string is a prefix, substring, and suffix of any string, and that a string is a prefix, substring, and suffix of itself.
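A brief sketch of the three predicates, including the empty-string and whole-string cases noted above. Note the argument order: the candidate comes first, the string being searched second.

```sml
val pre = String.isPrefix "ab" "abcd"        (* true *)
val sub = String.isSubstring "bc" "abcd"     (* true *)
val suf = String.isSuffix "cd" "abcd"        (* true *)

(* The empty string and the string itself always qualify. *)
val emptyPre = String.isPrefix "" "abcd"     (* true *)
val selfSuf = String.isSuffix "abcd" "abcd"  (* true *)

(* A longer string is never a prefix of a shorter one. *)
val tooLong = String.isPrefix "abcd" "ab"    (* false *)
```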
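compare (s, t)
does a lexicographic comparison of the two strings using the ordering Char.compare on the characters. It returns LESS, EQUAL, or GREATER, if s is less than, equal to, or greater than t, respectively.

collate f (s, t)
performs a lexicographic comparison of the two strings using the given ordering f on characters.

val < : string * string -> bool
val <= : string * string -> bool
val > : string * string -> bool
val >= : string * string -> bool
These functions compare two strings lexicographically, using the underlying ordering on the char type.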
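A sketch contrasting compare with collate. The ordering ciOrder is a hypothetical case-insensitive comparison written for this example; it is not part of the STRING signature.

```sml
val c1 = String.compare ("abc", "abd")   (* LESS *)
val c2 = String.compare ("abc", "ab")    (* GREATER: a proper prefix is smaller *)
val c3 = String.compare ("ABC", "abc")   (* LESS: #"A" precedes #"a" *)

(* Hypothetical ordering that ignores case. *)
fun ciOrder (a, b) = Char.compare (Char.toLower a, Char.toLower b)

val c4 = String.collate ciOrder ("ABC", "abc")   (* EQUAL *)

(* The relational operators agree with compare. *)
val lt = String.< ("abc", "abd")         (* true *)
```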
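toString s
returns a string corresponding to s, with non-printable characters replaced by SML escape sequences. This is equivalent to translate Char.toString s.

scan getc strm
fromString s
These functions scan their character source as a sequence of printable characters, converting SML escape sequences into the appropriate characters. They do not skip leading whitespace. They return as many characters as can successfully be scanned, stopping when they reach the end of the source or a non-printing character (i.e., one not satisfying isPrint), or if they encounter an improper escape sequence. fromString ignores the remaining characters, while scan returns the remaining characters as the rest of the stream.
The function fromString is equivalent to StringCvt.scanString scan.
If no conversion is possible, e.g., if the first character is non-printable or begins an illegal escape sequence, NONE is returned. Note, however, that fromString "" returns SOME "".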
For more information on the allowed escape sequences, see the entry for CHAR.fromString. SML source also allows escaped formatting sequences, which are ignored during conversion. The rule is that if any prefix of the input is successfully scanned, including an escaped formatting sequence, the functions return some string. They return NONE only when no prefix of the input can be scanned at all. Here are some sample conversions:
| Input string s | fromString s |
|---|---|
| "\\q" | NONE |
| "a\^D" | SOME "a" |
| "a\\ \\\\q" | SOME "a" |
| "\\ \\" | SOME "" |
| "" | SOME "" |
| "\\ \\\^D" | SOME "" |
| "\\ a" | NONE |
Implementation note:
Because of the special cases, such as fromString "" = SOME "", fromString "\\ \\\^D" = SOME "", and fromString "\^D" = NONE, the functions cannot be implemented as a simple iterative application of CHAR.scan.
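The following sketch shows toString and fromString acting as inverses on a sample value. All value names are illustrative. The last two bindings repeat inputs from the table above; their results are left uncommented because they depend on the implementation following the specification exactly.

```sml
(* toString replaces the newline with the two characters \ and n. *)
val shown = String.toString "a\nb"           (* the four characters a \ n b *)

(* fromString reads the escape back; the argument is the
   four-character string a \ n b written as an SML literal. *)
val parsed = String.fromString "a\\nb"       (* SOME "a\nb" *)

(* Round trip on a hypothetical sample containing a tab and quotes. *)
val sample = "tab\there \"quoted\""
val roundTrip =
  (String.fromString (String.toString sample) = SOME sample)

(* Inputs taken from the table of sample conversions. *)
val badEscape = String.fromString "\\q"
val emptyInput = String.fromString ""
```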
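toCString s
returns a string corresponding to s, with non-printable characters replaced by C escape sequences. This is equivalent to translate Char.toCString s.

fromCString s
scans the string s as a string in the C language, converting C escape sequences into the appropriate characters. The semantics are identical to fromString above, except that C escape sequences are used (see ISO C standard ISO/IEC 9899:1990).
For more information on the allowed escape sequences, see the entry for CHAR.fromCString. Note that fromCString accepts an unescaped single quote character, but does not accept an unescaped double quote character.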
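A small sketch of the C-style conversions, using a tab character and an unescaped single quote. The value names and sample strings are assumptions made for illustration.

```sml
(* The tab becomes the two characters \ and t. *)
val cShown = String.toCString "a\tb"           (* the four characters a \ t b *)

(* fromCString reads the C escape back. *)
val cParsed = String.fromCString "a\\tb"       (* SOME "a\tb" *)

(* An unescaped single quote is accepted. *)
val quoted = String.fromCString "it's"         (* SOME "it's" *)
```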
See also: CHAR, CharArray, CharVector, StringCvt, SUBSTRING, TEXT, WideCharVector
Generated April 12, 2004
Last Modified October 17, 2000
Comments to John Reppy.
This document may be distributed freely over the internet as long as the copyright notice and license terms below are prominently displayed within every machine-readable copy.
Copyright © 2004 AT&T and Lucent Technologies. All rights reserved.
Permission is granted for internet users to make one paper copy for their
own personal use. Further hardcopy reproduction is strictly prohibited.
Permission to distribute the HTML document electronically on any medium
other than the internet must be requested from the copyright holders by
contacting the editors.
Printed versions of the SML Basis Manual are available from Cambridge University Press. To order, please visit www.cup.org (North America) or www.cup.cam.ac.uk (outside North America).