Program COFECHA
Crossdating and measurement quality control

USERS MANUAL for Program COFECHA
by Richard L. Holmes
Laboratory of Tree-Ring Research, University of Arizona, Tucson, Arizona USA
February 1999

Adapted from Quality Control of Crossdating and Measuring: A Users Manual for Program COFECHA, in Tree-Ring Chronologies of Western North America: California, eastern Oregon and northern Great Basin, by R. L. Holmes, R. K. Adams and H. C. Fritts, Laboratory of Tree-Ring Research, University of Arizona, 1986, pages 41 to 49.

INTRODUCTION

Program COFECHA performs data quality control on a set of tree-ring measurements, verifying crossdating among ring measurement series and indicating possible dating or measurement problems. The printout from COFECHA provides documentation demonstrating the quality of crossdating within a tree-ring site.

Program COFECHA serves as a tool for the identification and documentation of portions of a tree-ring data set that may have dating errors or important errors in measurement. It may also be used to check crossdating among chronologies from sites within a region.

For each series a note is made of segments which correlate poorly with the corresponding segments of the master dating series (the mean of all other series) or which correlate higher at a position other than the position as dated. Single values are noted which have the effect of strongly lowering or raising the correlation of the series with the mean of all other series. Divergent year-to-year changes, absent rings and statistical outliers are listed. Basic statistics for each series appear in a table.

If there are series of difficult or questionable dating you may find their probable dating by putting them in a second data file. If the dating for the site is unknown, you may determine preliminary best-fit relationships among the series by giving their file name as the second file with a blank name for the first file.
At many research centers Program COFECHA has saved a great deal of personnel time by providing reliable quality control and archival documentation of crossdating. It may be especially useful to an investigator working alone or in a small group, or in dealing with unfamiliar species.

RUNNING PROGRAM COFECHA

COFECHA will ask for the name of the file containing dated tree-ring measurement series. Next it will ask for a second file of undated or counted measurement series, which will be examined separately. If there are series of difficult or questionable dating you may find their probable dating by including them in this second data file. If the crossdating for the site is entirely unknown, preliminary best-fit relationships among the series may be determined by including them in the second file, giving a blank name (respond with the RETURN key alone) for the first file. If no file of undated measurements is to be examined, press RETURN without giving a file name.

You may give a title to identify this run of the program. Either upper or lower case letters may be used in responding to prompts. The program may be terminated at any prompt by typing a slash ("/") and RETURN.

A menu shows the current setting of parameters for running the program. Any of these values may be changed by first typing the number appearing at the left, then responding with the modification desired. When no more changes are to be made, pressing RETURN alone begins processing of the data. Often very few or no changes need be made to the default values.

Menu for Program COFECHA:
Select number or first letter to modify:    Current values
  1  Rigidity of SPLINE for filtering         32
  2  SEGMENT length to examine                50  lagged 25
  3  AUTOREGRESSIVE MODEL                     A
  4  TRANSFORM series to logarithms           Y
  5  CRITICAL level of correlation            0.3281
  6  MASTER dating series, save               N
  7  LIST ring measurements                   N
  8  Parts of output to print                 1234567

Brief messages on the screen are intended to keep you posted on the progress of the program.
On termination of the program a summary message is shown. The results for printing are in the file ABCDECOF.OUT (assuming ABCDE is the job identification).

WHAT PROGRAM COFECHA DOES

Before crossdating and measurement problems are identified the data are transformed by the program so as to enhance those time-series characteristics which are related to crossdating, while minimizing the features unrelated to the task. The following steps are performed on each series of tree-ring measurements:

(1) A cubic smoothing spline with 50% cutoff of 32 years is fit to the series, and each value of the series is divided by the corresponding value of the spline curve, resulting in a series without trend or long waves and with a mean of 1. In short, low-frequency variance is removed from the series. If you give a negative number for spline rigidity, there is no transformation of the series: no spline, no autoregressive modeling and no log-transform. Experience with data sets from several regions suggests that the optimum job of discovering errors is done by using a smoothing spline with 50% frequency response of around 32 years. A more rigid spline may leave too much long-term variance in the series, and the resulting filtered series may still contain information unrelated to the dating pattern.

(2) The persistence of the smoothed series is removed by autoregressive modeling, thus removing short waves which may remain after the spline fit. This step makes the series conform more closely to the assumption of the Pearson correlation that the values are serially independent; the crossdating match stands out more sharply thereby. Robert Monserud (1986) makes an interesting analysis of this concept. Autoregressive modeling decreases the effect of varying the spline frequency response, although a very flexible spline (less than about 20 years) may add spurious high-order autocorrelation to the series. This step may be omitted at the user's option.

(3) The logarithm of each value in the series is taken after adding a constant of one-sixth of the mean. The constant is added to avoid the possibility of taking the logarithm of zero (which is negative infinity) in the case of a locally absent ring. The aim of the log transform is to weight proportional differences in ring measurement more nearly equally; a minor disadvantage is that after log transformation the distribution of values in the series is negatively skewed. This step may be omitted at the user's option.

Filtering with a smoothing spline, modeling and log-transformation, by removing low-frequency variance and persistence and examining only the high-frequency variance proportional to ring widths, mathematically simulates human perception on visual examination of a ring series for crossdating.

(4) The transformed measurement series is saved on a direct-access file for subsequent processing. The series is added to an accumulating series and a counter series is incremented for the time interval.

(5) After all series have been transformed, the accumulated series is divided by the counter series to give an arithmetic mean value function of all transformed dated series. The resulting master dating series is intended to embody the crossdating characteristics of the site, and may be saved for further analysis.

(6) Each transformed series is tested against the master dating series. The master series is first adjusted by removing the component contributed by the series under consideration to avoid comparing it against itself. The series is tested segment by segment against the adjusted master series for crossdating and general measuring accuracy, by calculating correlations for each 50-year segment of the series under examination with the master series matched at the point of crossdating. For each segment the correlation is verified to be positive and significant at the 99% level.
The correlation is also checked to see that it is higher when matched as dated than at any position shifted up to ten years earlier (-10) or later (+10) from the dating. Experience indicates that ten years on either side is adequate to locate most crossdating errors, and will also catch errors made by skipping or repeating a decade while measuring.

The default critical levels of correlation, below which segments will be flagged, are:

  Length of    Correlation at
  segment      99% confidence level
     10           0.7155
     15           0.5923
     20           0.5155
     25           0.4622
     30           0.4226
     35           0.3916
     40           0.3665
     50           0.3281
     60           0.2997
     70           0.2776
     80           0.2597
     90           0.2449
    100           0.2324
    120           0.2122

Successive segments are lagged 25 years, giving a 50% overlap. In order to test to the ends of the series, the first segment begins with the first year of the series and the last ends with the last year; all segments are of the same length. Intermediate segments begin on years evenly divisible by the lag. The overlap of the first two and the last two segments is therefore usually greater than the lag.

A segment length of 50 years provides sufficient degrees of freedom so that there are few segments where very high or low correlation occurs by chance, and the correlation at 99% significance is low enough that few segments are flagged unnecessarily. Yet 50 years is short enough to allow detection of dating errors of a few years in length, and thus allow the dendrochronologist to narrow the search for dating problems. The length of segments may be decreased for short series (but less than 30 years is not recommended) or lengthened to 100 years or more for long series in species with relatively weak crossdating or widely separated key crossdating years such as Sequoiadendron giganteum.

If in any time interval a major proportion of the series that make up the master series are incorrectly dated, the master series itself may not contain the correct dating pattern, and most or all of the series will show low correlation for that interval.
Test runs of the program show that if there are sufficient samples, more than half may be erroneously dated in a given time interval, so long as they are not systematically misdated in the same way, and the program will still correctly identify those series containing errors while not flagging the correct series. Thus the inclusion among the dated series of some with severe errors, though not to be preferred, generally does not destroy the dating pattern. The level of correlation among correctly crossdated ring measurement series may differ with tree species, geographic area, site homogeneity, amount of stand competition, and degree of disturbance. Through time a given tree may suffer differing amounts of stress from competition with other trees for light and moisture, competition for moisture with ground cover, access to soil by the roots, and disturbances such as fire and insect attack. This could cause the tree growth pattern through time to become either more similar to or more divergent from that of other trees. For these reasons, Program COFECHA does not provide precise accept/reject criteria for making objective decisions as to whether a series has been crossdated correctly throughout, but rather is to be used as a tool to assist the researcher in verifying the dating and measurement accuracy. Because visual characteristics of tree rings contain many clues to crossdating in addition to ring width, the program results should not be used as a substitute for visual crossdating on the wood sample. COFECHA is intended to assist data quality control by thoroughly examining all series from the first to the last value (the end of a series which extends beyond all others cannot be checked). It thus gives the dendrochronologist an independent tool to confirm the accuracy of dating and measurement. It may be used to assist in deciding to accept or reject series or portions of series for inclusion in a site chronology or for other analyses. 
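
The default critical levels tabulated above and the segment layout described above can be reproduced with a short sketch. This is an illustration only, not the program's own code: it assumes the critical levels come from a one-tailed test of the Pearson correlation using Student's t with n - 2 degrees of freedom, and the function names (critical_r, segment_starts) are hypothetical. The t quantile is found numerically so that only the Python standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return c / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=400):
    """P(T <= x) for x >= 0, by Simpson's rule on the interval [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Upper quantile by bisection; assumes 0.5 < p and a result below 10."""
    lo, hi = 0.0, 10.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def critical_r(n, confidence=0.99):
    """One-tailed critical correlation for a segment of n years."""
    df = n - 2
    t = t_quantile(confidence, df)
    return t / math.sqrt(t * t + df)

def segment_starts(first_year, last_year, length=50, lag=25):
    """First years of the segments: one at each end of the series and
    intermediate ones on years evenly divisible by the lag."""
    final_start = last_year - length + 1
    if final_start < first_year:
        return []  # series shorter than one segment
    starts = {first_year, final_start}
    year = (first_year // lag + 1) * lag  # first multiple of lag after first_year
    while year < final_start:
        starts.add(year)
        year += lag
    return sorted(starts)

if __name__ == "__main__":
    for n in (10, 25, 50, 100):
        print(n, round(critical_r(n), 4))  # compare with the table above
    print(segment_starts(1696, 1990))
```

For a series spanning 1696 to 1990, segment_starts places the first segment at 1696 and the next at 1700, which illustrates why the first two segments usually overlap by more than the lag.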
At the end of a run of Program COFECHA a brief summary appears; the summary is also printed on output:

****************************************
*C* Number of dated series        40 *C*
*O* Master series 1696 1990  295 yrs *O*
*F* Total rings in all series   7096 *F*
*E* Total dated rings checked   7068 *E*
*C* Series intercorrelation    0.643 *C*
*H* Average mean sensitivity   0.288 *H*
*A* Segments, possible problems    7 *A*
****************************************

PRINTED OUTPUT FROM PROGRAM COFECHA

Output from Program COFECHA appears in seven or eight parts:

Part 1: Title page, options selected, summary, absent rings by series
Part 2: Histogram of time spans
Part 3: Master series with sample depth and absent rings
Part 4: Bar plot of Master Dating Series
Part 5: Correlation of each series with Master
Part 6: Potential problems: low correlation, divergent year-to-year changes, absent rings, outliers
Part 7: Descriptive statistics
Part 8: Undated series adjustments for highest correlations (if there is a file of undated series)

The following paragraphs describe the results printed on the output.

PART 1.

The cover page shows the name(s) of the data file(s), the run title, and a list of contents of the printout. The menu is printed showing the options selected and below it is a brief summary of the results and a count of absent rings by year. On the following page a histogram shows in graphic form the time span covered by each dated series. Next, the master series in normalized form (mean = 0, standard deviation = 1) is listed along with the number of individual series contributing to the value for each year. Following this is a bar plot of the master series which serves as a visual aid to verification of crossdating. The bar plot is similar to a skeleton plot, but wider rings are indicated by longer bars. The relative width of all rings is shown along with an alphabetic code; lower-case letters indicate rings narrower than the local mean, upper-case wider than the local mean.
The symbol "@" indicates that the value is very close to the local mean; each letter progressing through the alphabet indicates an additional quarter standard deviation from the mean. The bar plot may assist in finding the exact year of problems in crossdating or measurement.

PART 2.

Correlations of each segment of the series with the master are printed in a table. Given a segment length of 50 years, correlation values are underlined and flagged with "A" if they are less than 0.3281, representing the 99% confidence level of significance in a one-tail test of the distribution of the correlation coefficient with 48 degrees of freedom. Correlations are flagged with a "B" if a correlation at some position other than as dated gives a higher correlation with the master series. At the right margin are the number of flagged correlations and total number of segments for the series.

PART 3.

Potential problems in dated series are reported here. All information pertaining to questions about a given series appears together.

[A] A line is printed for any segment which correlates higher at some position other than where it was dated, or which correlates below the 99% confidence level. This line shows the correlation of the segment at each position from -10 to +10. The value as dated (position +0) is underlined, and the highest correlation is underlined and bracketed. The position of highest correlation is also printed in the column labeled "High". For clarity, an open dashed line separates non-consecutive segments. Crossdating may be erroneous in the segments listed. Crossdating errors are often indicated by the occurrence of a low correlation at the dated position (+0) and a much higher correlation at some position near the dated position, for example at +1, -1, +2 or -2. If the misdating continues for more than a few rings, two or more successive segments may correlate higher at the same nonzero position.
A value of -2, for example, suggests that moving the dating back two years will give a higher correlation; possibly two rings (locally absent?) may not have been recorded at the later end of the flagged segments. If there are unflagged segments prior to the flagged ones, two extra rings may have been recorded (double or false?) at the early end of the segments. If the highest correlation is at position +10 or -10, the measurer may have skipped a decade or repeated it.

[B] For the entire series and for segments listed in [A] above, the effect on the correlation with the master series is listed for the rings whose presence most lowers or raises the correlation. A ring that either lowers or raises the correlation of a segment, particularly if the absolute value of its effect is greater than about 0.07, may indicate a measuring error or an especially wide or narrow ring that is misdated.

[C] Year-to-year differences in ring measurement are shown where they diverge by 4.0 standard deviations or more from the mean of the year-to-year differences of the other series for the same pair of years. This information may help to locate problems to the exact year.

[D] Locally absent rings (years with zero-value measurements) are listed.

[E] Rings are listed which are statistical outliers, defined for this purpose as being more than 3.0 standard deviations above or more than 4.5 standard deviations below the mean of the other series for that year. These individual rings are possible sources of dating or measurement error.

The listing of a segment or a ring in this section indicates that there may be an error in crossdating or one or more large errors in measuring. Most measurement errors will have the effect of lowering the correlations of the segments in which they occur. Listed segments may be candidates for remeasurement to check for errors in the original measurements.
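
The segment test behind the "A" and "B" flags and the [A] listing can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's own code: series and master are assumed to be already transformed and held as dictionaries keyed by year, the master is assumed to have had the tested series' contribution removed, and the function names are hypothetical. A shift of -2 means the segment is compared with the master two years earlier than dated.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant segment: correlation undefined, treat as zero
    return sxy / math.sqrt(sxx * syy)

def shifted_correlations(series, master, start, length=50, max_shift=10):
    """Correlate one segment with the master at positions -10 to +10.

    series, master: dicts mapping year to transformed value (assumed layout).
    Returns a dict mapping each usable shift to its correlation.
    """
    segment = [series[y] for y in range(start, start + length)]
    corrs = {}
    for shift in range(-max_shift, max_shift + 1):
        years = range(start + shift, start + shift + length)
        if all(y in master for y in years):
            corrs[shift] = pearson(segment, [master[y] for y in years])
    return corrs

def flag_segment(corrs, critical=0.3281):
    """Return (flags, best_shift): 'A' if the correlation as dated is below
    the critical level, 'B' if some other position correlates higher."""
    best = max(corrs, key=corrs.get)
    flags = ""
    if corrs[0] < critical:
        flags += "A"
    if best != 0:
        flags += "B"
    return flags, best
```

If two or more successive segments of a series return the same nonzero best_shift, that matches the pattern described in [A] for a misdating that continues for more than a few rings.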
Dates of locally absent rings (zero values) should be independently confirmed since they are determined by judgment rather than from direct observation.

A disturbance to the growth of the tree may produce a listing. A fire or other disturbance, sudden removal of competition, severe insect infestation or other environmental changes abruptly affecting the tree in question differently from others in the stand, may cause ring growth to be anomalous for one or a few years, and thus produce low correlation in one or two segments of the series and divergent year-to-year changes. This phenomenon was noted by L. O. White (personal communication), who observed in his collection of Pinus lambertiana from the Mendocino National Forest in California that evidence of fire often occurred within segments of measurement series with somewhat low correlation in Program COFECHA; these segments were nevertheless correctly dated.

We recommend a close examination of Part 3 of the output to confirm correct crossdating and to select those portions of series in which the dating and measurement should be checked. After corrections are made, Program COFECHA should be run on the clean measurement data set to confirm and document the correct crossdating of the site collection.

PART 4.

This is a table of descriptive statistics of the ring measurement series. Included are the total number of segments in each series and how many segments were found to have potential problems. The mean correlation of the series with the master series is given, along with standard time series statistics of the measurement before and after transformation, including the order of the autoregressive model (AR) applied to the series.

PART 5.

Date adjustment for unknown series. If there is a second data file with undated ring measurement series, a section appears whose purpose is to find the most probable dating of the unknown series which cannot be confidently dated by skeleton plot or other commonly used techniques.
Tentatively dated series may be included here. Possible crossdating matches for these series are indicated. This section is very similar in concept to M. L. Parker's (1967 and 1971) Shifting Unit Dating Program. For the unknown series as with the dated series, correlations are calculated for 50-year segments of the counted series lagged successively 25 years, but here the segments are tested at every position from beginning to end of the master dating series. For each segment the eleven highest-correlating positions are shown (the eleven best matches), starting with the highest ("Corr #1"), along with the number of years to add to the counted series to obtain the indicated match. If the same number appears consistently in one of the "Add" columns of the #1, #2 or #3 correlation there is a high probability that the correct dating of the series may be obtained by adding this number to the count of each ring. The dating should of course be verified on the wood sample, since there are many clues to crossdating in addition to ring width. At the end of the section dealing with a series is a tabulation of the segment adjustments which appear three or more times with their mean correlation. Further discussion is given by Holmes et al. (1986) in sections titled "Crossdating, measurement and related procedures" and "Effects of undiscovered absent rings," and in Appendix 1.
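
The search for probable dating of a counted series can be sketched as follows. This is an illustration under stated assumptions, not the program's own code: the counted series is held as a dictionary keyed by ring count, the master as a dictionary keyed by year, both already transformed, and all function and parameter names are hypothetical. In particular, tallying only the three highest matches of each segment (the top parameter) is an assumption made for this sketch.

```python
import math
from collections import defaultdict

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant segment: correlation undefined, treat as zero
    return sxy / math.sqrt(sxx * syy)

def best_matches(counted, master, start, length=50, keep=11):
    """Test one segment at every position along the master.

    Returns up to `keep` (correlation, add) pairs, highest correlation first,
    where `add` is the number of years to add to the ring count.
    """
    segment = [counted[c] for c in range(start, start + length)]
    first, last = min(master), max(master)
    results = []
    for add in range(first - start, last - length + 2 - start):
        years = range(start + add, start + add + length)
        if all(y in master for y in years):
            r = pearson(segment, [master[y] for y in years])
            results.append((r, add))
    results.sort(reverse=True)
    return results[:keep]

def consistent_adjustments(counted, master, length=50, lag=25,
                           top=3, min_count=3):
    """Adjustments that recur among the top matches of successive segments,
    mapped to their mean correlation."""
    tally = defaultdict(list)
    start, last_count = min(counted), max(counted)
    while start + length - 1 <= last_count:
        for r, add in best_matches(counted, master, start, length)[:top]:
            tally[add].append(r)
        start += lag
    return {add: sum(rs) / len(rs)
            for add, rs in tally.items() if len(rs) >= min_count}
```

An adjustment that appears in consistent_adjustments with a high mean correlation corresponds to a number that appears consistently in the "Add" columns; as the manual stresses, such a match still needs to be verified on the wood sample.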