L is based on simple ideas that can be built upon to create more powerful ones. Unlike Smalltalk, which strives to do the same, it is based on fairly familiar language constructs. One does not have to build their own IF construct, for example.
The most striking features of L are:
Note that although L makes heavy use of associative arrays (a.k.a. dictionaries), I intend them mostly as interface mechanisms, not as tools for data storage or collection manipulation. I assume that a table-based API will handle most data needs instead. (I am still working on the table/collection API.) In this sense L is a lot like Smalltalk in that collection manipulation is via libraries and not dedicated language syntax. This keeps the language simpler and more generic.
That reflects one important general philosophy of L: provide the syntax mechanisms to handle collections, but don't make the collection handling part of the built-in syntax.
    sub mysub(x)   // endx version of routine
      if x %n> 3   // we will explain the "n" later
        out "it's greater than 3"
        y = 1 + 2 + 3 + 4 _   // continued line
          + 5 + 6
      endIF
    endSub

    sub mysub(x) {   // brace style version
      if x %n> 3 {
        out("it's greater than 3");
        y = 1 + 2 + 3 + 4   // line is split
          + 5 + 6;
      }
    }

The endx version uses the underline character as a line continuation character (borrowed from VB).
I have not yet determined how style preference is indicated. One approach is to have a "setting _braces" or "setting _endx" at the top to indicate style (endx being the default), or to use auto-detection, where the first character after the routine declaration or block determines the interpretation for that routine/block. The technique selected will depend on interpreter technology feasibility. It might also be that an implementor chooses only one style or the other based on audience. Perhaps it can even be a dynamic setting, accessible via the "sys" array.

All routines in the brace syntax must have parentheses. However, in the endx style, parentheses are optional unless a value is returned (function).
    mysub x, y + 23         // valid in endx style
    a = mysub(x, y + 23)    // valid
    a = mysub x, y + 23     // invalid

A function is distinguished from a subroutine simply by whether a value is returned via an equal sign. Even if a routine has no Return statement, one can still extract a value from it as if it was a function. In that case, an empty (zero length) string will be returned.
L allows named parameters as well as positional parameters. Positional parameters must come before any named parameters.
    mysub(1, 2 _waiton _where j > 3 _orderby lname)
    // See "note on symbols" below for "#" explanation
    mysub(1, 2 #waiton #where j > 3 #orderby lname)
    mysub 1, 2 _waiton _where j > 3 _orderby lname
    mysub 1, 2 _waiton _where "j > 3" _orderby "lname"
    result = mysub(1, 2 _waiton _where j > 3 _orderby lname)

The non-parenthesis style is only allowed for the endx style and only if there is no returned value. The underscore marks a parameter name. Any arguments associated with a parameter (such as "j > 3" here) come after the parameter name. Quotes are optional around the arguments, unless it would cause syntax confusion.
The arguments are always passed in as strings. (Extracting parameters will be discussed later). Thus, "lname" above is a string and not an explicit variable. (It could perhaps be interpreted as a variable in the called routine, but this is purely up to the called routine.)
This may sound messy at first, but by not allowing the "leaky assignments" found in many scripting languages, (such as x = y = z = a), most confusion is immediately eliminated. Also note that a comma after the "2" parameter in the above example is optional. I figured if I picked one or the other, then people would keep getting it confused. The interpreter would generally use the first underscore encountered to know where the named parameters start. Also note that the comparisons shown above are assumed to be passed to SQL, since they are not legal in L. Named parameters are assumed to mostly be used by frameworks, and not casual programmer routines since they take longer to learn how to write.

In this case, "_waiton" has no arguments. It would serve as a single-word switch.
Named parameters make creating database-friendly and phrase-based syntax easier. One can emulate say XBase or Hypertalk syntax with this. It also provides "message-oriented" syntax, such as found in Smalltalk. Named arguments are very useful in cases where there may be many options, but only a few deviate from the default in practice.
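As a rough illustration of how a called routine might see such a parameter tail, here is a small Python sketch (Python is used only because L has no interpreter yet). The function name `parse_named` is hypothetical, the marker is hard-coded as an underscore, and quoted arguments are not handled; it only shows the idea that every argument arrives as a raw string keyed by its parameter name.

```python
def parse_named(text):
    """Split '_name arg arg _name2 ...' into {name: argument_string}.

    Hypothetical helper; quoting and a configurable marker are not handled.
    """
    named = {}
    current = None
    for token in text.split():
        if token.startswith("_"):
            current = token[1:]
            named[current] = ""          # a bare name acts as a switch
        elif current is None:
            raise ValueError("argument before any named parameter: " + token)
        else:
            # Arguments stay raw strings; the callee decides what they mean.
            named[current] = (named[current] + " " + token).strip()
    return named


print(parse_named("_waiton _where j > 3 _orderby lname"))
# {'waiton': '', 'where': 'j > 3', 'orderby': 'lname'}
```

Note that "j > 3" is never evaluated here; it would be handed on (to SQL, for instance) as text.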
Here are the primary symbols used for our examples, and only for examples:
_ Named parameter indicator
$ Scope specifier (see note)
# Assignment expression repeater
@ Write-enabled parameters
% Comparator marker
& String concatenation

Perhaps the symbol choices should be specifiable in a configuration file, along with blocking style preference and others. This may allow it to be more easily embedded in different environments, languages, and protocols.
Note that the dollar sign is also used for embedding variables into strings. However, since they are inside of strings, they may be considered in a different category. Also note that variable embedding is only supported in some routines. In our case, we are assuming that Out() and OutLn() translate them. The string concatenation character "&" should perhaps also be put on the auction block; however, nobody seemed to complain about that choice.
L has only one type. However, for familiarity purposes, you can pretend that it has 2 types: strings and associative arrays. (See Misc. Notes about the one type.) This may seem restrictive at first, but actually it is liberating. First, let's look at uses for strings.
What "type" something is depends completely on the context that you use it in. It is somewhat like Perl in that regard. This approach requires that some operators be clear about how they will act.
    a = "5"
    b = 2
    out a + b   // result: 7
    out a & b   // result: 52

Unlike some languages, L uses different operators for arithmetic addition and string concatenation. This is because L has no concept of built-in operator "overloading". (The closest thing is the canNum() function, but even this looks at the content itself to make a decision. If you like strong typing, L is not for you.)
Comparisons are done by putting a "comparative operator" next to the relationship symbol.
    if a %t= b    // compare as text
    if a %n= b    // compare as numeric
    if a %g= b    // general compare (see below)
    if a %tx= b   // no trimming before comparing
    if a %ta= b   // trim all white spaces before comparing
    if a %u= b    // compare as Unicode
    if a %tb= b   // ignore all white space, even between
    if a %tc= b   // case sensitive compare
    if a %i= b    // compare as integer (round at 0.5)
    if a %i> b    // greater than example with integers

Note that "!=" and "<>" can both be used for "not equal". This is so that habits from other languages won't throw one off.
All comparison operators need at least one indicator letter. Thus, if you accidentally use a non-L habit, you will get a syntax error. Also note that the default for text comparing is right trimming.
The "g" operator is for "general" comparing. It can be used for both numbers and text. It acts as if it trims both sides, ignores case, and removes any leading zeros and any trailing decimal zeros (if any) before comparing. This is the default for case statements, by the way. (I have been debating whether to use an implied "g" if there is no comparison operator. I am on the fence with that one.)
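To make the difference between the indicator letters concrete, here is a minimal Python sketch of a functional comparator in the spirit of cmp(). It is only an approximation under stated assumptions: the names `l_cmp` and `normalize_general` are hypothetical, only the "n", "t", and "g" indicators are covered, and since case handling for plain "t" is not spelled out above, the sketch leaves case alone for "t".

```python
import operator
import re

RELATIONS = {
    "=": operator.eq, "!=": operator.ne, "<>": operator.ne,
    "<": operator.lt, ">": operator.gt, "<=": operator.le, ">=": operator.ge,
}


def normalize_general(value):
    """Trim, lowercase, and drop leading zeros and trailing decimal zeros."""
    text = str(value).strip().lower()
    match = re.fullmatch(r"(-?)0*(\d+)(?:\.(\d*?)0*)?", text)
    if not match:
        return text                      # not number-like: treat as plain text
    sign, whole, fraction = match.groups()
    return sign + whole + ("." + fraction if fraction else "")


def l_cmp(a, op, b):
    """Compare a and b under an indicator such as "n>", "%t=", or "g="."""
    match = re.fullmatch(r"%?([a-z]+)(=|!=|<>|<=|>=|<|>)", op)
    if not match:
        # Mirrors the rule that every comparison needs an indicator letter.
        raise ValueError("comparator needs an indicator letter: " + op)
    mode, relation = match.groups()
    if mode == "n":
        left, right = float(a), float(b)
    elif mode == "t":
        left, right = str(a).rstrip(), str(b).rstrip()   # default right trim
    elif mode == "g":
        left, right = normalize_general(a), normalize_general(b)
    else:
        raise ValueError("indicator not covered by this sketch: " + mode)
    return RELATIONS[relation](left, right)
```

With this sketch, `l_cmp("10", "n>", "9")` is true while `l_cmp("10", "%t>", "9")` is false, which is exactly why the indicator letter has to be explicit in a language with one type.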
L uses the multi-emulation method to cover multiple "array" dimensions.
    var myarray[]             // declare an array
    myarray[1] = 'X'
    myarray.1 = 'X'           // same as prior
    myarray[1,1] = 'X'
    myarray["1,1"] = 'X'      // same as prior
    myarray["foo"] = 'X'
    myarray.foo = 'X'         // same as prior

The alternative dot notation can be used to provide not only table-oriented syntax, but also object-oriented syntax. Note that if the dot notation is used, then only letters, numbers, and underscores can be used after the dot. If you wish to use spaces and funky punctuation, then use brackets and quotes instead.
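One way to picture the multi-dimension emulation is that several subscripts are simply joined into one comma-separated string key. The Python sketch below models that idea; the class name `LArray` is hypothetical, and storing every value as a string is an assumption based on L having strings as its only scalar form.

```python
class LArray:
    """Toy model of an L associative array with comma-joined subscripts."""

    def __init__(self):
        self._items = {}

    @staticmethod
    def _key(subscript):
        # a[1, 1] arrives in Python as the tuple (1, 1); join it into "1,1".
        if isinstance(subscript, tuple):
            return ",".join(str(part) for part in subscript)
        return str(subscript)

    def __setitem__(self, subscript, value):
        self._items[self._key(subscript)] = str(value)

    def __getitem__(self, subscript):
        return self._items[self._key(subscript)]


grid = LArray()
grid[1, 1] = "X"
print(grid["1,1"])   # X
```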
L has something known as "hidden entries" (HE) in associative arrays. There are two kinds of hidden entries: reserved and user-defined.
    myarray["~sys_lang_numbers"] = 'yes'   // reserved
    slset(myarray,"numbers","yes")         // same as prior
    myarray["~sys_customx"] = "foo"        // user-defined
    set(myarray,"customx",'foo')           // same as prior

User defined HE's can be created and removed by the programmer. Reserved HE's are created by L and cannot be removed. HE's do not show up in regular array scans, such as "for each" constructs. (This can be changed by changing the "showall" reserved HE setting.)
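The scan rule can be sketched in Python with a plain dict. This is a guess at the behavior rather than a specification: the function name `visible_keys` is hypothetical, and both the full key "~sys_lang_showall" and the enabling value "yes" are assumptions extrapolated from the "numbers" example.

```python
HIDDEN_PREFIX = "~sys_"


def visible_keys(array):
    """Return the keys a 'for each' scan would visit."""
    # Assumption: the "showall" setting is switched on with the value "yes".
    show_all = array.get("~sys_lang_showall") == "yes"
    return [key for key in array
            if show_all or not key.startswith(HIDDEN_PREFIX)]


person = {"name": "Bill Jones", "~sys_customx": "foo"}
print(visible_keys(person))   # ['name']
```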
The reserved "trimming" setting, for example, causes L to remove leading zeros and/or spaces from the array key before inserting it or searching for it. The possible values are:
0 - No trimming (default)
1 - Rid any white space
2 - Rid white space and trim leading zeros
(May be deprecated, see below)
Setting 2 would translate a key of "001" to "1". Commas are also looked at, so that "01, 01, 001" becomes "1,1,1". Spaces are also removed. Setting 2 is ideal for using numeric subscripts. (Remember that L values are not typed. Therefore, the 2 setting is only a trimming behavior. Strings can still be used.)
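The three trimming settings can be approximated in a few lines of Python. The function names are hypothetical, and one detail is an assumption not stated above: a part made up entirely of zeros (such as "000") is collapsed to "0" rather than to an empty string.

```python
def _strip_zeros(part):
    stripped = part.lstrip("0")
    # Assumption: an all-zero part such as "000" collapses to "0", not "".
    return stripped if stripped or not part else "0"


def normalize_key(key, trimming=0):
    """Apply trimming setting 0, 1, or 2 to an array key."""
    key = str(key)
    if trimming == 0:
        return key                          # no trimming (default)
    key = "".join(key.split())              # setting 1 and 2: rid white space
    if trimming == 1:
        return key
    if trimming == 2:
        # Each comma-separated subscript is trimmed on its own.
        return ",".join(_strip_zeros(part) for part in key.split(","))
    raise ValueError("unknown trimming setting: " + str(trimming))


print(normalize_key("01, 01, 001", 2))   # 1,1,1
```

Because the values stay strings throughout, a key like "foo" passes through setting 2 unchanged.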
Thus, to set up an array for primarily numeric use:
    var foo[]
    slset(foo, "trimming", 2)
    numericate(foo)   // same as prior

The Numericate() function is a handy shortcut.
I am thinking about using the comparator letters instead of the above numbers. It seems the numbers are reinventing the wheel already covered by the comparator letters.
Another reserved HE is the "parent" setting. This will cause L to look up any missing keys in the array named in the "parent" key. ("~sys_lang_parent" is the full name.) This can be used for OO inheritance, as we will see later.
Reserved HE "onDone" ("~sys_lang_onDone") is used to execute a snippet of code when the array falls out of scope, or program end, whichever is first. This allows utilities written in L to automatically perform cleanup, such as closing a table. (More on table API's is presented below.)
Reserved HE "onChange" ("~sys_lang_onChange") is used to trigger custom operations when any value in an array is changed. For example, it can be used in a database API to determine whether or not the current record needs to be saved. (Its sister HE is "onRead".) Note that just because it is a "reserved" HE does not mean that it cannot be programmer-assigned. Being reserved only means that you cannot delete it.
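The "does this record need saving?" use of onChange can be mimicked in Python with a dict subclass. This is only an analogy: in L the hook would be a snippet of code stored as a string, whereas the sketch swaps in a Python callback, and the class name `WatchedArray` plus the choice to fire on every assignment (even to an unchanged value) are assumptions.

```python
class WatchedArray(dict):
    """A dict that calls a hook on every assignment (stand-in for onChange)."""

    def __init__(self, on_change=None):
        super().__init__()
        self.on_change = on_change

    def __setitem__(self, key, value):
        old = self.get(key)
        super().__setitem__(key, value)
        if self.on_change is not None:
            self.on_change(self, key, old, value)


dirty = []   # keys touched since the last save
record = WatchedArray(on_change=lambda arr, key, old, new: dirty.append(key))
record["price"] = "14.95"
record["price"] = "12.00"
print(bool(dirty))   # True: the record needs saving
```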
The user defined HE's can be used for whatever purpose the programmer can envision.
    sub myroutine(x)
      while x %n< 7
        x = x * 2 + 1
      endwhile
      for each a in b[]
        out a, b
      endfor
      for i = 1 to 7 step 1
        x = i - y
      end for
      select x
        case 1
          foo()
        case 2,3,4
          bar()
        otherwise
          foo()
      end select
      // case comparisons are based on
      // left-zero-trimmed and space-trimmed text.
      // If this is problematic for a situation,
      // then if-elseif-else can be used instead.
    endsub

    //------ Now, the curly braces version

    sub myroutine(x) {
      while x %n< 7 {
        x = x * 2 + 1;
      }
      for each a in b[] {
        out(a,b);
      }
      for i = 1 to 7 step 1 {
        x = i - y;
      }
      for(i=1; i < 7; i++) {   // C-style for accepted
        x = i - y;
      }
      select x {
        case 1 { foo(); }
        case 2,3,4 { bar(); }
        otherwise { foo(); }
      }
    }

Note that optional spaces ("endif" and "end if") can be used for all endx enders for those used to both styles. "Then" is also optional after IF statements because it is so common in other languages.
"Exit for" and "exit while" allow one to exit the corresponding loop. The exiting applies to the innermost loop if there is a potential overlap.
    if x = null then ...         // the old way
    if x %t= "(null)" then ...   // the new way

Note that this is only a convention suggestion since no built-in L functions return a null equivalent. However, database API's, for example, may have to translate them.
    global x = 4
    sub routine8
      var x = 50
      routine9()
    end sub
    sub routine9
      var x = 2
      out x          // result: 2
      out my$x       // result: 2
      out global$x   // result: 4
    endSub

Any statements not put inside a routine are assumed to belong to the routine "main()". If an explicit Main is defined, then no statements, other than Setting statements or interpreter directives, are allowed outside of routines. Variables defined in main() are not automatically global, unlike many other scripting languages.
L also supports "regional" variables. This means that a variable has file-level scope, or "region" scope if inside of a defined region.
    regional x
    x = 30
    sub foo()
      var x = 2
      out x            // result: 2
      out regional$x   // result: 30
    end sub
    // the "regional" qualifier would not be
    // needed if there was not a local x.

If a routine name appears in more than one file, then use filename$routine_name to call distant routines. (File name excludes extension and does not recognize spaces.)
At the top of the file (near Setting clauses), other program files can be "recognized" using syntax like the following:
    recognize mycode.L

This is roughly analogous to Java's "import" keyword.
If there is a routine naming overlap, the local routine gets precedence. If there is no local routine of a given name, but it is defined in multiple files, then an error is raised. At this point, you may need to use the file name scope specifiers for such routines.
Region names can also be declared to create package-like scope:
    region foo
      // a bunch of statements and routines here
    end region
    ...
    foo$aroutine()   // "foo$" may be optional

Region names have precedence over file names.
L supports the inherit option to allow access to parent's scope. The "my$" scope specifier (see above) is more important if Inherit is used in order to distinguish between local and inherited variables. Without it, precedence goes to the caller. (If it did not, then we could not guarantee access to all scope ranges under the condition of name overlaps.)
I chose not to use a scope-specifier for parent variables so that routines can be split without adding specifiers to all the working variables. That is a pain.

Inherit can apply to multiple levels, but must be in all relevant levels.
    sub a
      var x = 4
      b()
    endSub
    sub b inherit
      c()
    endSub
    sub c inherit
      out x   // will print 4
    endSub

If sub B did not have "inherit" (or C for that matter), then X would be undefined in C. Basically "inherit" says, "I can see any variable that my caller can".
    x = "cat"
    foo "dog", x, _yaddle _daddle 12
    ...
    sub foo(za, @ma)
      ma = 4   // we changed it for caller
      for each i in args[]
        outln i & ": " & args[i]
        // could also: outln "$i: $args[i]"
      endFor
    endSub

Typical output:

    1: dog
    2: cat
    yaddle:
    daddle: 12

As you can see, positional parameters are also included in the Args array under numeric keys based on position. Also note that even though "ma" was changed, the values in Args stay the same (unless explicitly changed in the array).
All named parameters (which start with underscores here) are returned as keys in the "args" array, and any arguments to the named parameters as the values under the keys. You can use the Eval() function to evaluate them. You may want to use the Inherit keyword so that the parent's variables can be examined if variables are passed. Or, you can use the "caller$" scope prefix.
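The layout of the "args" array described above can be modeled with an ordinary Python dict. This is only a sketch of the data shape: the function name `build_args` is hypothetical, and the named parameters are assumed to have already been split into (name, argument) pairs with the marker removed.

```python
def build_args(positional, named):
    """Model the 'args' array: numeric keys first, then named keys."""
    args = {}
    for index, value in enumerate(positional, start=1):
        args[str(index)] = str(value)        # positions become keys "1", "2", ...
    for name, argument in named:
        args[name] = str(argument)           # a switch carries an empty string
    return args


args = build_args(["dog", "cat"], [("yaddle", ""), ("daddle", "12")])
for key in args:
    print(key + ": " + args[key])
```

The loop prints the same four lines as the "Typical output" of the foo() example, with an empty value after "yaddle:".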
Note that the "args" array will not contain another array if a positional parameter is an array, but only the "~sys_lang_value" entry. Remember, nested arrays are against L's philosophy.
The "@" symbol indicates that the value is changeable (passed by reference). Parameters without "@" are read-only (roughly equivalent to pass-by-value).
    s = "select * from prices"
    t = openDB( _sql s _readonly)
    while getNext(t)   // display price of all items
      out "Price of item " & t.descript
      outln " is " & t.price
    endWhile
    setFilter t, _where price < 15.00
    ...
    close(t)

Alternatively, the While line could also look like this:

    while t.getnext()

This is actually equivalent to:

    while eval(t.getnext)

For a more formal table or collection manipulation package, the functional approach, getnext(t), is probably preferable because Hidden Entries can be used to avoid overlaps with table field names. The package builder can put the code under a "~sys_getnext" entry instead of "getnext". The package designer can create and use as many ~sys_... entries as needed. For example, Getnext may need to know not to fetch the next record on the first execution. An HE flag can take care of this.
Note that usage of function syntax for table handle operations and dot syntax for fields is the reverse of the Visual Basic Script (VBS) convention. In VBS you would say something like t("price") instead of t.price and t.getnext() instead of getnext(t). The L approach is preferable because fields are accessed more often than handle operations in most programs. It is rumored that VBS may eventually use "!" to distinguish them instead of functional syntax, as it does in VB. Perhaps L can use ! to indicate a hidden entry in future editions.

Note that our example array, t, can hold only one row at a time. Collection traversal operations, such as getnext(), are used to move among different records. Where the data for each record comes from is purely up to the DB interface package designer. (Provisions for hooking into C or C++ routines should also be provided, but will not be discussed here yet.)
Set-based DB engines (for SQL) would probably get all selected records at once and buffer them, while cursor-oriented approaches (like XBase) would probably retrieve them from the DB as needed (with possible sub-buffering for speed). Note that the syntax for each approach would not change much except perhaps for DB driver selections on opening. This allows one to swap drivers without major rewrites of record handling application code.
The "onDone" HE reserved setting (described previously) can be useful for automatically closing a table when a handle falls out of scope. The previously described "onChange" HE could also be helpful for database engine implementation.
Note that I am not a fan of OO. However, I am allowing L to cater to OO in the hopes that it may help "sell" it, and allow the rebirth of table-oriented syntax and all the rest of the good things that L provides. Thus, like Steve Jobs' Apple deal with Microsoft, compromise is sometimes needed for survival.

One can view an OO class as simply a C-like struct (structure) with possible behavior attached to the structure members, along with the addition of inheritance.
In L, the equivalent of the structure is the associative array. Behavior is given to the elements by simply using the parenthesis syntax:
    // Example of indirection
    var foo[]
    foo.x = "outln 'Hello L World!'"
    outln foo.x    // result: outln 'Hello L World!'
    foo.x()        // result: Hello L World!
    eval(foo.x)    // equiv. to prior

See the difference? L makes extensive use of this kind of indirection (string evaluation).
Although being able to execute a value and vice versa may seem risky to some, this is a scripting language, where protection is not the primary concern. L in general uses context of usage rather than type stamping to determine what can be done to what. The benefit for sacrificing protection is added run-time flexibility.
These are already part of L. L's OO syntax simply borrows them. However, there are two features of L that were added for OO purposes (but could be used for other purposes).
The first is the class structure:
    class foo
      this.x = 23.4
      this.y = -12.9
      method aMethod
        yaz(this.x)
        raz(this.y)
      endMethod
    end class

This is equivalent to:

    global foo[]
    foo.x = 23.4
    foo.y = -12.9
    foo.aMethod = "yaz(this.x) \n raz(this.y)"

The only difference is that the Method block does a syntax check upon script load. The check would be equivalent to:

    canEval(foo.aMethod)

The "this" alias simply references the name of the most recently referenced array. Doing any operation on "foo.aMethod" would make "foo" the most recently referenced array. Note that the function recent() will return the most recently referenced array in case one needs to save the name for deferred processing in a framework.
One drawback of this approach is that parameters cannot be directly passed to methods. One must set class values to pass parameters.
    mail.to = "bob@host.com"
    mail.from = "spammer@spammakers.com"
    mail.title = "You too can be rich!"
    mail.send()

The final OO-influenced feature is the ~sys_lang_parent Hidden Entry that was already mentioned.
    class tiger
      slset(this, "parent", "feline")
      ...
    end class

The HE ~sys_lang_parent simply tells the interpreter to look at an alternative (parent) array if the element was not found in the given array. The chain of parents can be multiple levels. (Although this feature was added for OO purposes, you are welcome to use it for other things besides OO.)
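The parent lookup rule amounts to a short loop. Below is a Python sketch under a few assumptions: arrays are kept in a registry keyed by name (since the parent entry holds a name, not the array itself), the function name `lookup` is hypothetical, and the guard against circular parent chains is an addition not mentioned above.

```python
def lookup(arrays, name, key):
    """Find key in the array called name, following ~sys_lang_parent links."""
    seen = set()
    while name is not None and name not in seen:
        seen.add(name)                      # guard against circular parents
        array = arrays[name]
        if key in array:
            return array[key]
        name = array.get("~sys_lang_parent")
    raise KeyError(key)


arrays = {
    "feline": {"legs": "4", "sound": "meow"},
    "tiger": {"~sys_lang_parent": "feline", "sound": "roar"},
}
print(lookup(arrays, "tiger", "sound"))   # roar (found locally)
print(lookup(arrays, "tiger", "legs"))    # 4 (found in the parent)
```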
To "instantiate" a new object, use the Clone() function.
Although this has storage efficiency problems at large quantities because the code is duplicated for each instance, it is assumed that some sort of collection system or database is used to store large quantities of instance information, and not "objects." Note that the inheritance feature (parent HE) can sometimes be used to alleviate this problem because it provides a path back to a single instance. L's instantiation-by-copy drawbacks somewhat resemble those of JavaScript.
Note that the "Inherit" keyword has nothing to do with OO.
setting _braces _foo off _bar _explicit
    x = "Price of item $item is $thingy.price until Monday"

Is the same as

    x = "Price of item " & item & " is " & thingy.price _
        & " until Monday"

Parentheses are used to clarify boundaries if needed:

    x = "Iliketosquish$(ourstuff)together."

Is the same as

    x = "Iliketosquish" & ourstuff & "together."

The VarParse() function can be used to translate embedded variables for those routines that don't automatically support it.
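A VarParse-like translation can be sketched with one regular expression in Python. The names `var_parse` and `_resolve` are hypothetical, variables are supplied as a plain dict (with nested dicts standing in for arrays), and the rule that a dot only counts when it is followed by a name character is an assumption made so that a sentence-ending period is left alone.

```python
import re

# "$(name)" or "$name", where name may contain dotted parts like thingy.price
_EMBEDDED = re.compile(r"\$\((\w+(?:\.\w+)*)\)|\$(\w+(?:\.\w+)*)")


def _resolve(path, variables):
    value = variables
    for part in path.split("."):
        value = value[part]                 # walk array.element references
    return value


def var_parse(text, variables):
    """Replace embedded $variables in text with their values."""
    def substitute(match):
        path = match.group(1) or match.group(2)
        return str(_resolve(path, variables))
    return _EMBEDDED.sub(substitute, text)


values = {"item": "widget", "thingy": {"price": "9.99"}, "ourstuff": "X"}
print(var_parse("Price of item $item is $thingy.price until Monday", values))
# Price of item widget is 9.99 until Monday
```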
    if x %n> y do foo()
    // is equivalent to
    if x %n> y
      foo()
    end if

"Do" does not apply to the braces style.

    out 1, 8   // result: 1 8

This makes debugging a little simpler.

    HTML
      <table border=1>
        <tr><td>
          Hey hey hey!
        </td></tr>
      </table>
    endHTML

Of course, out() would also output as CGI/HTML output in a web context.

    if foo()
      var x = 3
      blah()
      var y[]
      blah()
    end if
    // x and y[] are no longer available here

    foobar = replace(foobar, "mary", "marry")
    foobar = replace(#, "mary", "marry")   // same as above
    x = x + 5
    x = # + 5   // same as prior

    for i = 1 to x step 1
      blah()
    end for

If x is less than 1, then the loop will not be entered. Similarly, in for i = 10 to x step -1, if x is greater than 10, then the loop will not be entered.
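The entry rule for counted loops can be checked against Python's range(). This sketch assumes integer bounds and an inclusive upper bound (as in VB-style "for ... to" loops), neither of which is stated outright above; the function name `for_values` is hypothetical.

```python
def for_values(start, stop, step=1):
    """List the values 'for i = start to stop step step' would visit."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Assumption: the 'to' bound is inclusive, so widen range() by one step.
    end = stop + 1 if step > 0 else stop - 1
    return list(range(start, end, step))


print(for_values(1, 0))        # []  (x below 1: loop never entered)
print(for_values(10, 11, -1))  # []  (x above 10 when counting down)
print(for_values(1, 3))        # [1, 2, 3]
```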
    myvar = 5
    // is the same as:
    myvar.~sys_lang_value = 5

In such a setup a typical variable assignment would be nothing more than a shortcut for the "value" Hidden Entry. For efficiency purposes, the interpreter can perhaps assume a simple variable until other assignments or references imply an array. However, this is an implementation issue and not a syntax issue.
This also suggests an approach to have a strong-typing option in the future: there could be a "~sys_lang_type" hidden entry. I don't recommend it at this stage, but it shows how flexible L can be.
    if x %n> 3
      foo
    end if

I am open to suggestions. Perhaps just keep it the way it was, but make the parentheses mandatory:

    if x (n>) 3

Perhaps curly braces could be used instead:

    if x {n>} 3

For now, I favor the "%" style.
canEval() - returns true if no syntax errors. Does not check the existence of routines nor variables.
canNum() - Boolean, if an expression can evaluate to a number.
case
clone() - Copies an array
cmp() - Functional version of comparators
    Example: if cmp(x , "n>", y)
    or: if cmp(x , "%n>", y)
dump() - Shows entire array/variable contents. Mostly for debugging.
    Example: out dump(x, "/n")
    // output follows:
    foo = "Bill Jones"
    gork = 7.2
    ~sys_zork = "blah"
    (etc....)
each - used with for...each constructs
eval()
exec() - Execute a string as if it was code
elements() - Number of array elements (excluding hidden)
for
has() - Boolean, if array has given element
    Example: has(anArray, "bar")
if
inherit - allows defined routine to inherit parent's scope.
    Ex: sub x(y) inherit
my - scope specifier for innermost scope
new - (reserved for possible future use)
numericate() - makes subscripting more "number friendly"
otherwise - used with case statements
out - output
outln - output with a linefeed (like C println)
package (reserved for future use)
return
recent() - similar to 'this' but only returns the text name of the array
remove() - remove element from array
scope (for future implementation)
select - define a case statement
set() - set user-defined Hidden Entry
slset() - set language-defined Hidden Entry
static - define variables that do not reinitialize between subroutine calls.
sub - declare a subroutine or function
sys (reserved for future use)
this - alias to the most recently referenced array
varparse() - embedded string parsing
while

RESERVED HIDDEN ENTRIES
[to be inserted later]
I call a draft of such a language "FIST". It is kind of based on the idea of LISP, but with the function name on the left of the parenthesis instead of inside. Everything in the language would be a function, including IF, End-IF, and function declarations or statements. This greatly simplifies its syntax structure. However, unlike LISP, the parenthesis are usually not "far reaching" for statement blocks. Blocks are marked by a set of two functions using an X/endX convention instead of parenthesis. Here is an example of some L code translated into FIST:
    //*** L CODE ***
    myFunc(1, amt, "hey", #foo 7)   // call a function

    sub myFunc(a, @b, c)
      print("foo=" & params.foo)
      if a %n> b
        print("total=" & a + b)
      end if
      r = b - a
      return(r)
    end sub

    //*** FIST CODE ***
    myFunc(1, amt, "hey", #foo 7)

    sub("myFunc")
    setParam("1", "a", true)
    setParam("2", "b", false)   // not alterable
    setParam("3", "c", true)
    print(cat("foo=", params["foo"]))
    if(cmp(a, "n>", b))
      print(cat("total=", add(a,b)))
    endif()
    assign(r, subtract(b, a))
    return(r)
    endsub()

For debugging, a programmer could perhaps peek at the generated FIST to see how the interpreter interpreted the L code. It may also simplify the interpreter to only allow FIST code in EVAL-like operations. For one, FIST ignores line breaks (seen as white space instead) because the borders between functions are clear. This makes it easier to create and store strings that are programming code. You could say that L is simply a convenient short-hand for FIST.
    x = cat(a, b)

Instead of:

    assign(x, cat(a, b))

Although this breaks the "pure-function" rule, assignment is perhaps frequent enough to justify complicating syntax for it. It could also make FIST useful as a relational language foundation.