This document describes ChaiScript, an open-source (BSD license) scripting language invented by Jason Turner. It was written by examining the ChaiScript source code and searching the internet for examples and explanations. Any inaccuracies are the fault of the author.
General nature of the language
ChaiScript is a strongly typed declarative language with syntax similar to C++. Important differences from C++ are:
- All statements are legal at file scope, outside any function.
- Variables have no type until they are first given a value, either at the time of declaration or later.
- Flow-of-control constructs like if, for and while take blocks enclosed in { … }; they cannot take single statements.
- There are no postincrement or postdecrement operators (++ and -- after the variable name).
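The differences listed above can be seen together in a short sketch. The variable names are purely illustrative, and print is assumed to be the built-in output function mentioned later in this document.

```chaiscript
// Statements run at file scope; no main() function is needed.
var count = 0

// The braces are required even though the body is a single statement.
while (count < 3) {
  ++count   // prefix increment; the postfix form count++ is not available
}

for (var i = 0; i < 2; ++i) {
  print(i)
}

if (count == 3) {
  print("count is 3")
}
```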
Character set
ChaiScript uses 7-bit ASCII characters only. The ChaiScript parser assumes the ASCII encoding; at one point it checks for character codes greater than 0x7E.
Tokens
Identifiers are case-sensitive sequences of characters starting with a letter (a…z, A…Z) or an underscore (_), and continuing with zero or more letters, underscores and digits (0…9). The format is identical to that of C++ and C.
The following keywords are recognised:
attr auto break case catch class continue def default else finally for global if return switch try var while
There are some special identifiers which are not strictly keywords but have reserved meanings:
_ __CLASS__ Infinity false __FILE__ __FUNC__ __LINE__ NaN true
The following punctuation tokens are recognised:
:: ( ) : { } [ ] . ,
The following operator tokens are recognised:
? || && | ^ & == != < <= > >= << >> + - * / % ++ -- ! ~
Line breaks are marked by either a single linefeed or a carriage return followed by a linefeed.
Integer literals are as in C++. Decimal, binary, octal and hexadecimal integer literals are all supported.
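As a small illustration, the same value can be written in each of the four bases. This is a sketch that assumes the C++ prefixes (0b for binary, a leading 0 for octal, 0x for hexadecimal) carry over unchanged; the variable names are illustrative.

```chaiscript
var d = 255          // decimal
var b = 0b11111111   // binary
var o = 0377         // octal
var h = 0xFF         // hexadecimal

// All four literals denote the same number, so this should print true.
print(d == b && b == o && o == h)
```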
Floating-point number literals are as in C++, but C++17 hexadecimal floating-point literals are not supported.
String literals are sequences of characters enclosed in double quotes. Double quotes can be put in a string using the sequence \". String literals may contain interpolations, which are enclosed in ${ … } and may contain unescaped double quotes. Interpolations may be nested. The contents of an interpolation are treated as ChaiScript code: they are evaluated when the string literal is evaluated, and the result is converted to a string and substituted into the literal.
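A minimal sketch of escaped quotes and interpolation follows. The variable names are illustrative only.

```chaiscript
var name = "world"
var n = 3

print("Hello, ${name}!")        // the value of name is substituted
print("n + 1 = ${n + 1}")       // an interpolation may hold any expression
print("She said \"hi\"")        // escaped double quotes inside a string
```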
Operator literals are operators enclosed in back ticks, like `+`. They are function objects referring to operators.
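For example, an operator literal can be called like a function or passed to another function. This sketch assumes the standard library function foldl(container, function, initial) and the inline vector syntax [ … ] are available; neither is described in this document.

```chaiscript
// Call the addition operator as an ordinary function.
print(`+`(2, 3))                   // prints 5

// Pass the operator to a function that expects a two-argument function.
print(foldl([1, 2, 3, 4], `+`, 0)) // prints 10
```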
Comments are as in C++. That is, a comment lasting to the end of the current line starts with //, and a multi-line comment is of the form /* … */.
Annotations are special single-line comments starting with #. A function’s annotation can be retrieved programmatically.
Statements and declarations
Statements and declarations are terminated either by the end of a line or by a semicolon. Multi-line statements are parsed greedily; if the parser has not encountered a semicolon or other delimiter indicating the end of a statement when it reaches the end of a line, it moves on to the next line. Unlike in C++, semicolons are not mandatory.
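A brief sketch of the two ways of ending a statement, using illustrative variable names:

```chaiscript
var a = 1; var b = 2   // two statements on one line, separated by a semicolon
var c = a + b          // the end of the line ends this statement
print(c);              // a trailing semicolon is allowed but not required
```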
Programs
A ChaiScript program is a sequence of statements and declarations. Unlike in C++, statements may exist outside any function. A program to print ‘Hello World’ can therefore be expressed as the one-line program
puts("Hello World")
Built-in types
The following types are built in, and have constructors (e.g., uint32_t(), int(45), etc.) with the same name:
Boolean types
bool
integer types
int long unsigned_int unsigned_long long_long unsigned_long_long size_t int8_t int16_t int32_t int64_t uint8_t uint16_t uint32_t uint64_t
floating-point types
double long_double float
character types
char wchar_t char16_t char32_t
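The constructors can be used wherever an expression is allowed. The following is a sketch only: it assumes that a constructor called with no arguments yields zero, as in C++ value initialisation, and that double accepts a numeric argument in the same way that int does.

```chaiscript
var i = int(45)      // int holding 45
var u = uint32_t()   // assumed to be zero-initialised
var d = double(2.5)  // double holding 2.5

print(i)
print(u)
print(d)
```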
Standard library types
These types are provided by the standard library, which is automatically loaded by the default ChaiScript constructor when ChaiScript is compiled.
string types
string: very similar to C++’s std::string
aggregate types
Vector
Map
Pair
other types
future
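A short sketch of the string and aggregate types in use. The member functions push_back and size, the [] indexing on Map, the two-argument Pair constructor and the first and second members are assumptions modelled on the corresponding C++ standard library types; they are not described in this document.

```chaiscript
var s = "abc"
print(s.size())      // length of the string

var v = Vector()
v.push_back(1)
v.push_back(2)
print(v.size())      // number of elements in the vector

var m = Map()
m["a"] = 1           // Map entries are keyed by string
print(m["a"])

var p = Pair("x", 10)
print(p.first)
print(p.second)
```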
Creating variables: auto and var
Variables are declared using auto, or var, which is a synonym of auto. The syntax is:
auto <name>
or
auto <name> = <expression>
and the type of the variable is deduced from that of the expression if any. If no expression is given, the type is determined the first time the variable is assigned.
Examples:
auto n = 3
var f = 56.78
auto s = "abc"
auto u = unsigned_int() // sets u to 0
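The form without an initialiser can be sketched as follows; the names are illustrative. Each variable takes its type from the first value assigned to it.

```chaiscript
var x        // declared, but no type yet
x = 42       // x now holds an int

auto y       // auto and var are interchangeable
y = "text"   // y now holds a string

print(x)
print(y)
```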