Design Notes

Introduction
Selector Language
Evolutionary Computation
Design Ideas
Population Operators
Multiple Grammars

Introduction

Let X be a time series of vectors such as the numeric vector x = #(num| xtime x1 x2 x3 ... xm), and let Y be a numeric vector of "score" values. Let both X and Y be of length N.

The time series, X, together with the "scores", Y, are used to train this learning machine. There is also a testing set, TX and TY, of vectors similar to those in X and Y but for the testing time period (a time period not present in X or Y). After training, the machine is presented with the testing set, TX, and outputs an estimator Lambda, Selector, which attempts to estimate ty in TY when presented with the corresponding tx in TX.

Running this Selector on every tx in TX returns a Vector EY, of estimates for TY. Each element in EY contains a numeric estimate for the corresponding value in TY. The Selector Lambda attempts to:

Make the natural ordering of EY be predictive of the natural ordering of TY
Minimize the least squared error between EY and TY

The internal model used by the Selector Lambda to compute ey is important. The range of possible X and Y training sets and the power of the possible estimates is determined by the internal model used in the Selector Lambda.

Linear Regression

Using the technique of linear regression the Selector Lambda could provide accurate estimates for all X and Y training sets which had been constructed from the following model, y = c0 + c1*F1(x), where c0 and c1 are internal numeric constants and F1 is an arbitrary numeric function on members of X. If the choice of F1 is known, linear regression can estimate the values of C0 and C1 quite accurately just by training on X and Y. If the choice of F1 is unknown, linear regression works very poorly.

Multiple Linear Regression

Using the technique of multiple linear regression the Selector Lambda could provide accurate estimates for all X and Y training sets which had been constructed from the following model, y = c0 + c1*F1(x) + c2*F2(x) ... + cm*Fm(x), where c0 through cm are internal numeric constants and F1 through Fm are arbitrary numeric functions on members of X. If the choices of F1 through Fm are known, multiple linear regression can estimate the values of C0 through Cm quite accurately just by training on X and Y. If the choices of F1 through Fm are unknown, multiple linear regression works very poorly.

Evolutionary Sequencing Regression

The Evolutionary Sequencing Regression uses the techniques of linear regression, multiple linear regression, support vector regression and percentile grid regression. The Selector Lambda can provide accurate estimates for all X and Y training sets which have been constructed from the following models, y = c0 + c1*E1(x) or y = c0 + c1*E1(x) + c2*E2(x) ... + cm*Em(x), where c0 through cm are internal numeric constants and E1 through Em are valid Selector language regression expressions. Even when the choices of E1 through Em are unknown, evolutionary sequencing regression works very well, performing natural order preservation as its first priority and minimizing the least squared error as its second priority. An explanation of the power and limitations of valid Selector regression expressions is contained in the next section.

Selector Language

The Evolutionary Sequencing Machine evolves a population of well-formed-formulas (WFFs) in a regression-oriented grammar known as Selector. Each WFF is known as a Selector, and is trained to map number vectors in X and XT to numbers in Y and YT. The Selector language is a dialect of JavaScript, to make learning and reading easy by the current generation of programmers. Each Selector Lambda, wff, supports the following methods.

wff(x) ==> ey : invoking the main wff maps a number vector, x, to a number, y.
wff.run(X) ==> EY : invoking wff.run outputs a number vector, EY, of estimates for Y.
wff.train(X,Y) ==> true : invoking wff.train trains the Selector wff on the specified training data.

The Selector grammar is defined, within the ESM Lambda, in esm:selector:%DECLARATION, which is a feature-based grammar specification understood by the parseLib. The esm.selector child Lambda is a "Selector" parser, generated from esm:selector:%DECLARATION by parseLib. The Selector parser translates ASCII text strings into Selector WFFs which are annotated s-expressions. Each WFF s-expression, in the population, may be annotated with grammar notes taken from the rules defined in esm:selector:%DECLARATION.

Each Selector WFF must chose one of the following regression model syntax to remain gramatically correct.

Linear Regression

         
             regress expression;         

                         for example:

             regress (x10 / sin(x12));

Trains a linear regression Selector WFF for the model expression. The expression must be a valid Selector regression expression.

Support Vector Regression

         
             svmregress(expression1, expression2,..., expressionM);         

                         for example:

             svmregress (x10,cos(x12)/log(x3));

Trains a support vector regression Selector WFF, with a cubic kernel, for the models expression1 through expressionM. The models expression1 through expressionM must each be valid Selector regression expressions. There must be AT LEAST one model expression and there may be up to M model expressions (where M is the number of elements in each x vector). The special case of svmregress(); is assumed to mean svmregress(xtime,x1,x2,...,xm);

Percentile Grid Regression

         
             pgmregress(expression1, expression2,..., expressionM);         

                         for example:

             pgmregress (x1,exp(x12)/log(x3));

Trains a percentile grid regression Selector WFF for the models expression1 through expressionM. The models expression1 through expressionM must each be valid Selector regression expressions. There must be AT LEAST one model expression and there may be up to M model expressions (where M is the number of elements in each x vector).

Basis Grid Regression

         
             bgmregress(expression1, expression2, expression2, expression4);         

                         for example:

             bgmregress (x1,exp(x12)/log(x3));

Trains a basis grid regression Selector WFF, with a basis grid, for the models expression1 through expressionM. The models expression1 through expressionM must each be valid Selector regression expressions. There must be AT LEAST one model expression and there may be up to four model expressions.

Selector Expressions

Valid Selector regression expression well-formed-formulas (WFFs) can be constructed by recursively applying the following production grammar rules.

             TERMINAL: xtime x1 x2 ... xm

                                  explanation:

                              A TERMIAL is any one of the elements of the input vector, x.

                                  example:

                              svmregress (xtime, x3, x10);

             UNARY:    abs(EXP) cos(EXP) cube(EXP) exp(EXP) integer(EXP) log(EXP) (- EXP) sign(EXP) sin(EXP) sqrt(EXP) square(EXP) tan(EXP) tanh(EXP)

                                  explanation:

                              A UNARY is any one of the unary operators, shown above, enclosing a valid Selector regression expression WFF.

                                  example:

                              svmregress (sin(xtime), abs(x3), log(sqrt(x10)));

             
             BINARY:   avg(EXP1,EXP2) (EXP1+EXP2) (EXP1-EXP2)  (EXP1/EXP2)  (EXP1*EXP2) expt(EXP1,EXP2) max(EXP1,EXP2) min(EXP1,EXP2) mod(EXP1,EXP2) 

                                  explanation:

                              A BINARY is any one of the binary operators, shown above, enclosing two valid Selector regression expression WFFs.

                                  example:

                              svmregress (sin(xtime)-abs(x3), log(sqrt(x10))*x2);

             
             EXP:      TERMINAL UNARY BINARY  

                                  explanation:

                              An EXP is any WFF resulting from invoking the TERMINAL, UNARY, or BINARY Selector grammar rules up to five times recursively.

                                  example:

                              pgmregress (sin(xtime)*abs(x3), tan(sqrt(x10))*x2);

A more detailed description of the Selector language is provided in the chapters on Selector Introduction and Selector Statements.

Evolutionary Computation

There are many specialties in the umbrella field of Evolutionary Computation including (but not limited to): Genetic Algorithms; Genetic programming; Artificial Life; Grammatical Evolution; Evolvable Hardware, etc. However different, these various specialties all have a least common denominator of similarities; and, each of these specialties possess at least some techniques which have proven useful when applied to selected problem domains.

Populations

One of the primary least common denominator areas across all specialties in the umbrella field of Evolutionary Computation, is the commonality of large populations of things which change, over time, becoming a population of fitter things. In the specialties of Genetic Algorithms, Genetic Programming, and Grammatical Evolution, the things are called genomes. In other specialties the things are called Lambdas, organisms, robots, rules, DNA sequences, etc.

Fitness

Another important least common denominator area across all specialties in the umbrella field of Evolutionary Computation, is the commonality of fitness. The population is said to become fitter over time. In each specialty, the concept of fitness is expressed via a mapping from things in the population to a fitness score. In our observations, of each area of specialty, the concept of fitness mapping can be generalized to a mapping from elements of a population of things to a real vector space.

Survival

Survival is another least common denominator area across all specialties and is related to the fitness measure in some manner. In all specialties there are discrete time steps during which the things in the population are promoted (survive) into the next time frame. Normally, this survival mechanism is related to the fitness measure in that the population becomes fitter over time.

Operators

In almost all areas of specialty which we studied the concept of population operators morphing the population is a least common denominator. In the specialties of Genetic Algorithms, Genetic Programming, and Grammatical Evolution, the operators are called mutation, recombination, reproduction, etc. In many cases the population operators incorporate the concept of survival. In these architectures, the population operators map things (in the current population) into things (in the population one time step in the future). Things which are not promoted do not survive.

Data

All areas of specialty which we studied supported the concept of training data in one form or another. In many cases the training data can be numeric vectors. In other cases the training data is the initial environment for an ant colony, or a geographic playing field for robot traversal, etc. In all these architectures, the training data, population operators, and fitness mapping work together to produce populations of things which better solve a training problem. In many architectures, there is a concept of problem generalization. In these architectures, once trained, things in the population can be presented with testing data (not seen during the training period) and better solve a testing problem (presumably because the previous training experience was generalized).

Evaluation

All areas of specialty which we studied supported the concept of evaluation in one form or another. There must be some process by which each thing in the population can be evaluated so as to produce a fitness score. In the specialty of Genetic Programming, the genome is often an Lisp s-expression and the evaluation is the Lisp eval function. In the specialty of Genetic Algorithms, a translator is normally required to convert the genome into a program which can be evaluated. In the specialty of Grammatical Evolution, the genome is run through a grammar production resulting in a program which can be evaluated.

Design Principle

The Evolutionary Sequencing Machine Lambda (ESM) is a symbolic regression engine, designed to accept an X Y training matrix as input and to output an estimator Lambda (Selector) which attempts to map elements in X onto elements in Y, in an order preserving manner, with a high degree of accuracy. In order to accomplish this task, the Evolutionary Sequencing Machine Lambda (ESM) is designed, to take advantage of practical research results in the general field of Evolutionary Computation, regardless of their specialty of origin.

Design Ideas

The Evolutionary Sequencing Machine (ESM) is designed, from its basic foundation, to take advantage of any practical research results in the general field of Evolutionary Computation, regardless of the specialty of origin. The Evolutionary Sequencing Machine architecture is a merger of a wide range of evolutionary techniques and theories from throughout the whole field of Evolutionary Computation.

Population of WFFs

The Evolutionary Sequencing Machine evolves a population of well-formed-formulas (WFFs) in a problem-oriented grammar known as Selector. The Selector grammar is defined, within the ESM Lambda, in esm:selector:%DECLARATION, which is a feature-based grammar specification understood by the parseLib. The esm.selector child Lambda is a "Selector" parser, generated from esm:selector:%DECLARATION by parseLib. The Selector parser translates ASCII text strings into Selector WFFs which are annotated s-expressions.

By implementing its genomes as annotated s-expressions, the Evolutionary Sequencing Machine takes advantage the large body of Genetic Programming tools while also facilitating the use of many interesting techniques available in other specialties. Bit string annotations to the WFF s-expressions, facilitate the use of many interesting techniques available in Genetic Algorithm. Grammar annotations to the WFF s-expressions, facilitate the use of many interesting techniques available in the Gramatical Evolution. Goal and plan annotations to the WFF s-expressions, facilitate the use of many interesting techniques available in MultiLambda Systems and Swarm Learning. Nevertheless, since each WFF in the population is essentially an s-expression, evaluation requires only a call to the AIS Lisp eval function.

Operators on WFFs

The Evolutionary Sequencing Machine raises the grammar concepts embodied in Grammatical Evolution far above the WFF genome's grammar annotations, making the understanding and preservation of WFF grammar an essential task of the population operators. Thus ESM operators can understand that two WFF s-expressions cannot be combined because their are grammatically incompatible, that two ESM WFF s-expressions must be combined in specific ways because of their grammar annotations, that two ESM WFF s-expressions are essentially equivalent, or that two ESM WFF s-expressions can be combined and reduced to a more primal form.

Furthermore, the task of fitness improvement is raised far above any requirement for operational purity. Evolutionary Sequencing Machine WFFs can be annotated with a wide variety of information to help the population operators raise the population fitness level. Annotations can aid operators in establishing protected or unprotected sub-populations. Annotations can aid operators in enhancing populations with WFFs generated via exhaustive parital-search, statistical learning methods, etc. In general the ESM uses annotations to extend the style and type of population operator far beyond those listed in any one specialty while requiring only that the operator be grammar preserving.

Data

The Evolutionary Sequencing Machine learns to select and score the best individuals from a universe of individuals over time. Over a series of discrete time steps, a universe of individuals is collected for each time step (the training data). During prediction the ESM attempts to select the best individuals from the testing data (the ESM is NOT given the "score" values for the testing data). The individuals are Number vectors and represent quantitative information about things such as Stocks, People, Cities, etc. The score values are single Numbers and represent some quantitative value about these same Stocks, People, Cities, etc.

Each individual followed by the system is given a unique identifier which remains unique across all time periods studied (no two individuals ever have the same identifier). Furthermore, each time period studied is given a unique ascending integer index (i.e. week 1, week 2, etc.). So, for a series of time periods, historical information about groups of individuals is collected for each time period. The historical information collected for each individual for each time period is stored in a Number Vector and includes: the time period index; the unique identifier of the individual; and other numeric information about the individual pertinent to the current investigation. Finally, each individual in each time period is given a numeric "score" which determines the value of the individual in that time period. The "best" individuals have the highest "score" values.

Fitness of WFFs

During training, the ESM is given historical information and the "score" values for time periods 0 through T for all individuals. For each time period 0 through T, we can calculate the average score for both the theoretic best and worst selections of N individuals. The ESM currently computes the theoretic best and worst scores for the each of the top and bottom 5% partitions (5%, 10%, 15%, ... 45%, 50%). The order preserving "fitness" of a Selector WFF are measured by what percentage of the theoretical "best" averages it obtained on each of these top and bottom percentile partitions. In addition, the ESM also computes the least squares error between the predictions and the "score" values for each individual.

During training, each WFF in the population (without knowing the score values), attempts to score the N individuals for time periods 0 through T. Associated with each WFF in the population {w in P} is the least squares error and the order preservation measure for time periods 0 through T. These two fittness measures are normalized into a single fittness score as follows S[w,t] = g(Error[w,t],Order[w,t]).

The fitness measure for each WFF {F[w] == (S[w,0] * S[w,1] * ... * S[w,t])} is the product of its normalized scores for time periods 0 through T. The fitness measure tends to favor those WFFs which have done consistantly well during the training period (as opposed to those with only an occasional big success).

Population Operators

The Evolutionary Sequencing Machine applies a set of grammar preserving operators to evolve the population of well-formed-formulas (WFFs) into a population of fitter WFFs. The population operators are grammar preserving such that each operator maps from a population of Selector WFFs to a population of Selector WFFs. The understanding and preservation of WFF grammar is an essential task of the population operators. WFFs can be annotated with a wide variety of information to help the population operators raise the population fitness level. Annotations can aid operators in establishing protected or unprotected sub-populations evolved with different grammars. Annotations can aid operators in enhancing populations with WFFs generated via exhaustive parital-search, statistical learning methods, etc.

Here follows an overview of the population operators supported during Evolutionary Sequencing Regression.

The grow operator generates a new WFF, from the empty set, by randomly invoking Selector grammar rules so as to create a WFF.

The mutate operator generates a new WFF, from a parent WFF, by randomly mutating the parent WFF in a grammar preserving manner.

The children operator generates two new WFFs, from two parent WFFs, by randomly invoking Selector grammar rules so as to cut and recombine child WFFs in a grammar preserving manner.

The greed operator generates a new WFF, from the empty set, by using greedy search to invoke Selector grammar rules so as to create a WFF.

Multiple Grammars

The Evolutionary Sequencing Machine supports multiple grammars in evolving the population of well-formed-formulas (WFFs) problem solutions. All ESM population operators are grammar preserving such that each operator maps from a population of Selector WFFs to a population of Selector WFFs, in a grammar preserving manner. The design concept of grammar preservation extends to establishing protected and unprotected sub-populations evolved with different grammars. The understanding and preservation of multiple WFF grammars is an essential task of the Evolutionary Sequencing Machine. The Evolutionary Sequencing Machine certainly attempts to evolve general programs, with evolved functions and variables; but, there are also attempts to evolve program solutions using Support Vectors, Multiple Regression, and Decision Tree models. In fact the ESM supports the co-evolution of a wide variety of Selector WFF grammars, providing potentially unique strategies for approaching the problem.

There Selector grammars currently supported during Evolutionary Sequencing Regression.

The REG grammar forms valid Selector WFFs which train and run a simple linear regression model on a the training data set.

The MVL grammar forms valid Selector WFFs which train and run a multiple linear regression model on a the training data set.

The SVM grammar forms valid Selector WFFs which train and run a support vector regression model with a cubic kernel on a the training data set.

The BGM grammar forms valid Selector WFFs which train and run a basis grid regression model on the training data set.

The FRM grammar forms valid Selector WFFs which train and run a multiple factor regression model on the training data set.

The ENN grammar forms valid Selector WFFs which train and run a neural net regression model on a the training data set.