The principal benefit of representing a data sequence with a proportional walk is that less of the information in the data is lost in translation. A large amount of information is lost in statistical curve fitting because only the constant features the researcher thinks of including are present in the structure of the formula. Proportional walks include all kinds of dynamics, of known and unknown origin, including transients and behavioral transitions on multiple scales. They represent them in a form that is more readily comprehended by direct inspection, though an experienced observer will then be able to see many of the same features by direct inspection of the data itself.
By including the transients and behavioral transitions, proportional walks generated from a sequence serve to identify behavioral changes that would require new equations to describe. This provides an efficient kind of hypothesis generator regarding the structures of the natural phenomena being observed.
DR functions take a numerical sequence as input and produce a corresponding sequence of values as output, drawing a curve from a curve. The analytical package is available as a collection of AutoLISP routines called CURVE for use in the graphical database AutoCAD. The principal difference from other curve fitting techniques, such as least squares autoregression, is that DR fits the curve according only to the smoothness of the path, and ignores entirely its distance from some preconceived mathematical curve. Thus it produces curves approximating both the scale and the dynamics of the data, not just getting to similar points, but also getting there in similar ways.
To do this one needs to learn how a derivative is defined for functions and how to adapt that definition to sequences. In the absence of other reasons to believe that a sequence reflects the derivative continuities of a physically continuous process, one needs statistical measures to determine whether a sequence displays that pattern, and whether the presence of a physically continuous process is implied. Once a sequence is represented by a proportional walk, various tests can be used to measure how well it represents the data or to determine its dynamic and scalar similarity to other results.
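One common way to adapt the derivative to a sequence is to take successive differences between neighboring values. The sketch below assumes equally spaced samples with unit spacing; the function name `differences` is illustrative and is not taken from the CURVE package.

```python
def differences(seq, order=1):
    """Return the order-th successive differences of an equally spaced sequence.

    Assumption: unit spacing between samples, so each difference stands in
    for a rate of change. Each pass shortens the sequence by one value.
    """
    values = list(seq)
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


squares = [0, 1, 4, 9, 16]
first = differences(squares, 1)    # rates of change
second = differences(squares, 2)   # rates of change of the rates
```

Each level of differencing plays the role of one higher derivative, which is the sense in which later sections speak of the derivatives of a sequence.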
Intro, Derivatives, P-Walks, Basics, Tests, StepVar, Sub-Series,DAR, Craft,.... TOP
For mathematical functions, having a derivative is well defined, based on whether the rates of change of a function approach the same value at points successively closer to a given point from both sides. This is also known as the test for derivative continuity. The criterion for derivative continuity is much more restrictive than that for simple continuity. The latter only requires that a function be defined for all values of the variables (not having gaps in the coordinates of the variable) and that the values of the function approach each other when its variables do (not having gaps in the coordinates of the function). Derivative continuity in a function also means not having abrupt changes in its rates of change (not having gaps in the accelerations), i.e. following a smooth curve. These things have been very well worked out for a long time (Courant & Robbins 1941).
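The two-sided test described above can be illustrated numerically. This is only a sketch: it evaluates the one-sided rates of change at a single small step `h` rather than taking a true limit, and the helper name `one_sided_slopes` is hypothetical.

```python
def one_sided_slopes(f, x, h=1e-6):
    """Approximate the rate of change of f at x from the left and from the right."""
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right


# A smooth curve: both sides approach the same slope at x = 0.
smooth_left, smooth_right = one_sided_slopes(lambda x: x * x, 0.0)

# A curve with a corner: the two sides disagree at x = 0.
corner_left, corner_right = one_sided_slopes(abs, 0.0)
```

For the parabola the two slopes differ only by an amount that shrinks with `h`, while for the absolute value they stay at opposite signs however small `h` is made.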
The problem with extending this concept to either physical processes or sequential measurements of them lies in the gaps in nature and in measurement. Both data and physical processes are completely fragmented. Every measurement is an isolated value with no ultimate nearby values approaching from any direction. Physical processes are much the same. Surfaces are mostly composed of holes, lines of spaces, and regular behaviors of intermittent smaller scale processes. Nature and all our information about it are largely composed of gaps, broken chains presenting the regularities of the world as completely discontinuous.
There are also lots of sequences that appear to flow so smoothly it's hard to see them as anything else, like a movie. There is also the marvel of classical physics, that nature's apparent fragmentation can be considered as if following perfectly continuous differentiable functions. It is even possible to derive from the conservation laws a principle that all physical processes must, at root, satisfy differential continuity (Henshaw 1995). Even for quantum mechanics, discounting that the principal concerns of QM are probabilistic events beyond the realm of physical process, it now seems that the quantum mechanical events that do materialize may still conform to classical mechanics (Lindley 1997). This suggests that QM is perhaps not only consistent with the continuous world, but might also require it, along with the differential continuity of physical properties that classical mechanics implies.
The principal strategic task is to correctly identify where the rates of change of the underlying behavior reverse. If the behavior is expected to have been smoothly changing, but there are few data points, most of the inflection points in the behavior will have occurred somewhere in between. The first curve to construct would then be one keeping the original data points and adding new points where they would be predicted, given the assumption of a regular progression of derivative rates. The function that does this is called DIN, for derivative interpolation.
If, on the other hand, there is an abundance of data containing fairly clear trends, but small scale erratic variation hides all the larger scale inflection points, then some kind of local averaging might be used as the first step. The least distorting kind of local averaging is double derivative smoothing, DDSM. Both DIN and DDSM work by comparing the third derivative calculated from the first four of five adjacent points with that calculated from the last four of the same five points, and adjusting the middle point to make the two third derivatives equal.
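The five-point rule just described can be worked out for equally spaced data. Using finite differences, the third derivative from the first four points is y[i+1] - 3y[i] + 3y[i-1] - y[i-2], and from the last four it is y[i+2] - 3y[i+1] + 3y[i] - y[i-1]. Setting them equal and solving for the middle point gives y[i] = (4(y[i-1] + y[i+1]) - (y[i-2] + y[i+2])) / 6. The sketch below applies that formula in one pass. It is not the AutoLISP DDSM routine: the name `ddsm_pass`, the equal spacing, the choice to compute every new value from the original values, and the untouched two points at each end are all assumptions made for illustration.

```python
def ddsm_pass(ys):
    """One smoothing pass that equalizes the two third differences around each point.

    Assumptions: equally spaced samples; new values are computed from the
    original values (not from already adjusted neighbors); the first two
    and last two points are left as they are.
    """
    out = list(ys)
    for i in range(2, len(ys) - 2):
        neighbors = ys[i - 1] + ys[i + 1]
        outer = ys[i - 2] + ys[i + 2]
        out[i] = (4 * neighbors - outer) / 6
    return out


cubic = [x ** 3 for x in range(7)]
unchanged = ddsm_pass(cubic)          # a cubic already has equal third differences

bumped = [0, 1, 8, 30, 64, 125, 216]  # the value 27 replaced by 30
repaired = ddsm_pass(bumped)          # the middle value is pulled back to 27
```

A sequence that already follows a cubic passes through unchanged, which is one way to see why this kind of averaging distorts smooth trends less than a plain moving average would.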
Once the best possible representation of the smallest scale of regular fluctuation is constructed, the next larger scale of fluctuations in the data is isolated by using TLIN to draw a curve through the inflection points of the small scale fluctuations, constructing a dynamic trend line. This might be followed by subsequent use of DIN and DDSM, and then repeated, until the result is a smooth monotonic centroid: a curve without fluctuations that closely approximates both the scale and dynamics of the original data, its central dynamic trend. That completes the first major step. The derivatives of this curve will display a number of definite predictions about the nature of the physical behavior being studied.
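The first half of what TLIN does, locating the inflection points of the small scale fluctuations, can be sketched with second differences. This is an illustration under stated assumptions, not the TLIN routine itself: the names `inflection_indices` and `trend_knots` are hypothetical, samples are assumed equally spaced, an inflection is taken to be a strict sign change in the second difference, and the sample with the smaller curvature on either side of the change is the one reported.

```python
def inflection_indices(ys):
    """Return sample indices where the second difference changes sign."""
    # d2[j] is the second difference centered on sample index j + 1.
    d2 = [ys[i - 1] - 2 * ys[i] + ys[i + 1] for i in range(1, len(ys) - 1)]
    found = []
    for j in range(len(d2) - 1):
        if d2[j] * d2[j + 1] < 0:
            # Report whichever neighbor has curvature nearer to zero.
            found.append(j + 1 if abs(d2[j]) <= abs(d2[j + 1]) else j + 2)
    return found


def trend_knots(ys):
    """Return (index, value) pairs at the inflection points.

    A trend line would then be drawn through these points.
    """
    return [(i, ys[i]) for i in inflection_indices(ys)]
```

For a steady rise with a regular ripple added, the knots fall near the underlying rise, which is the sense in which a curve through them traces the next larger scale of the data.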
More information on the individual command operators is found in drtools.pdf, and a selection of DR commands in AutoLISP is available in Curve.zip (for AutoCAD 13 or earlier).
Experience with the method can certainly help, beginning with a study of the available examples. Two techniques for gaining confidence in the statistical accuracy of the results are below.
For example, a clear difference is seen between the log/log plot of step variance to step length for random walks and that for the Malmgren data on plankton size. The numerical tests indicate that about 95% of random walks will have a slope of step variance to step length between 0.65 and 1.25, whereas the Malmgren data has a value of 0.3. This indicates that, in this case, the tripling of plankton size which the data records in all likelihood progressed by non-random steps. This test was developed in the JMP statistical package (Jr. SAS), and the set of functions is available in JMP format from StepVar.zip
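The slope referred to above can be estimated along the following lines. This is a rough sketch and not the JMP functions in StepVar.zip: the step lengths (1, 2, 4, 8, 16), the use of overlapping steps, the ordinary least squares fit, and the names `step_variance_slope` and `random_walk` are all assumptions made for illustration.

```python
import math
import random
import statistics


def step_variance_slope(seq, lags=(1, 2, 4, 8, 16)):
    """Slope of log(step variance) against log(step length).

    For each step length k, the steps are seq[i + k] - seq[i] over all i.
    Assumes every step length gives a nonzero variance (log of 0 is undefined).
    """
    xs, ys = [], []
    for k in lags:
        steps = [seq[i + k] - seq[i] for i in range(len(seq) - k)]
        xs.append(math.log(k))
        ys.append(math.log(statistics.pvariance(steps)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least squares slope of ys on xs.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def random_walk(n, seed=0):
    """Cumulative sum of n independent Gaussian steps (illustrative helper)."""
    rng = random.Random(seed)
    position, walk = 0.0, []
    for _ in range(n):
        position += rng.gauss(0.0, 1.0)
        walk.append(position)
    return walk
```

For a random walk the variance of a step grows in proportion to its length, so the slope comes out near 1. A sequence whose variance grows much more slowly with step length, as reported for the Malmgren data, gives a noticeably smaller slope.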
A second example of using subsets to validate results is available from another study, which looked at the effect of the imperfect treatment of end points in the sequence. DR routines usually retain end points on a curve, with lower confidence, by making assumptions about imaginary data points beyond them. In the modeling of the history of economic growth presented in "Reconstructing the Physical Continuity of Events" (GNP), about 10 data points from the end of each curve segment were shown to have low significance (figure sE.5, GNP10.gif), but these end condition effects had no impact whatever on the central portions of the curve.
Another example is provided by the comparison of different subsets for the gamma ray burst data.
Derivative area ratios provide a potentially very useful tool for measuring the combined scalar and shape similarity of two sequences. The principle is that two curves which have equal areas are similar in overall scale, and if all their derivatives have equal areas, they are dynamically similar too.
There are several developments which need to occur for this to be made fully reliable. The intent was to have a statistical measure which would find whether a simplified shape (a reconstructed dynamic mean) was faithful in representing the accumulation of accelerations of the original sequence.
In principle, if one curve has fluctuating higher derivatives that cancel each other out in the accumulation of their effects, it is found to be equivalent to a curve that gets to the same place without all that fluctuation. If that is true throughout the range and for all derivatives, then the two curves are dynamically very similar indeed.
A number of demonstration charts were developed to show how various types of perturbation affect the results. They confirm the principle that curves with the same areas for underlying derivatives are dynamically similar.
The values in the DAR bar chart are calculated from the signed area under a curve. Thus DARs can be either positive or negative, indicating both a magnitude and a direction of correlation.
Demonstration charts showing how various types of perturbation affect the results:
One of the interesting apparent indicators of dynamic similarity is DAR values which decline regularly for higher derivatives, rather than erratically. Another interesting pattern frequently seen is a near perfect match for the area of the curve and its first derivative (levels 0 and 1) and near zero correlation for higher derivatives.
Derivation:
$$\mathrm{DAR}(S_1, S_2) = \frac{A(S_1)}{A(S_2)} \tag{1.1}$$
for sequences $S_1$ and $S_2$, with $A(\cdot)$ the sum of the areas of a sequence. To make it more easily interpreted, the ratios for each level are graphically represented with a magnitude comparison, so that the ratio for each derivative level is calculated with the smaller absolute value as numerator. This makes a bar chart of values between 1 and -1.
For the underlying derivatives the measure is the same:
$$\mathrm{DAR}(S_1^{(n)}, S_2^{(n)}) = \frac{A(S_1^{(n)})}{A(S_2^{(n)})} \tag{1.2}$$
where $S^{(n)}$ is the $n$-th derivative of a sequence, and a DAR is calculated for some or all of the calculable derivatives of the sequence. Variations on the DAR measure might also be developed to indicate regional equivalence within a curve, rather than comparing the accumulation of change over the whole curve:
$$\mathrm{DAR}_r(S_1, S_2) = \sum_{R} \frac{A(S_1)}{A(S_2)} \tag{1.4}$$
where $R$ is a rule to define regions.
For a five level weighted average, each area ratio is multiplied by its level, and the sum is divided by the sum of the levels. This gives greater weight to the derivative levels that are harder to match. The wAV 'number' for a five level DAR chart is calculated as sum(level*DAR)/(1+2+3+4+5), and has a range of +1 to -1. The wAV 'indicator' is printed as a large type integer in the DAR bar chart box. It is 100 times the wAV number, for the first series of positive DAR values only, with a range of 0 to 100.
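The DAR levels and the wAV number can be put together as a minimal sketch. Several details here are assumptions rather than the author's implementation: derivatives are taken as successive differences of equally spaced data, the signed area is a trapezoid sum with unit spacing, the weights 1 through 5 are assigned in order starting with the curve itself, a ratio of two zero areas is reported as 0.0, and the wAV indicator is left out. All function names are illustrative.

```python
def signed_area(seq):
    """Signed area under a sequence by the trapezoid rule with unit spacing."""
    return sum(seq) - 0.5 * (seq[0] + seq[-1])


def differences(seq):
    """Successive differences, standing in for the next derivative level."""
    return [b - a for a, b in zip(seq, seq[1:])]


def dar(s1, s2):
    """Area ratio with the smaller absolute area as numerator, so -1 <= DAR <= 1."""
    a1, a2 = signed_area(s1), signed_area(s2)
    small, large = (a1, a2) if abs(a1) <= abs(a2) else (a2, a1)
    return small / large if large != 0 else 0.0  # 0.0 for 0/0 is an arbitrary choice


def dar_levels(s1, s2, levels=5):
    """DAR for the two curves and for each of their successive derivative levels."""
    ratios = []
    for _ in range(levels):
        ratios.append(dar(s1, s2))
        s1, s2 = differences(s1), differences(s2)
    return ratios


def weighted_average(ratios):
    """wAV number: sum(weight * DAR) / sum(weights), with weights 1, 2, 3, ..."""
    weights = range(1, len(ratios) + 1)
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


curve = [1, 2, 4, 8, 16, 32, 64, 128]
doubled = [2 * v for v in curve]
half_match = weighted_average(dar_levels(curve, doubled))
```

Each sequence needs enough points to survive the differencing, since every derivative level is one value shorter than the one before it.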
P. Henshaw 1998 DR main page