A dissertation submitted in partial fulfillment of the
requirements for the degree of Doctor of Philosophy,
in Geophysics
by
Dr. John N. Louie, Dissertation Advisor
May, 1998
``Journey in the Land, and then behold how
He originated creation''
The Quran
``Yet all experience is an arch wherethro'
Gleams that untravell'd world whose margin fades
For ever and for ever when I move.''
Alfred Tennyson
All human experience stems from inspiration and cooperation from others. The experience of going through the graduate school and reaching the end is no exception. Before mentioning individuals by name, I would like to appreciate all the unnamed people who inspired me all through my life.
I would like to express my heart-felt gratitude and appreciation to my dissertation advisor, Dr. John N. Louie whose unselfish support and positive attitude always helped me through my most difficult graduate days. His overall receptiveness to new ideas and willingness to examine them were always refreshing to me. I appreciate the research funding through the years that he provided me thus enabling me to complete graduate school. I can never appreciate John enough for his understanding of the human problems of his students.
I thank Drs. John Anderson, Jim Brune, Sushil Louis and Yehua Zeng for serving on my thesis committee. I thank all of them again for agreeing to let me defend my thesis in a very short notice.
Graduate classes that particularly helped me with the necessary foundation for my research were John Louie's ``Seismic Imaging'' and ``Geophysical Case Histories'' , Jim Brune's ``Surface Waves'', John Anderson's ``Inverse Problems'', Steve Wesnousky's ``Seismotectonics'', Chuck Langston's (Penn State) ``Seismology'' and Kevin Furlong's (Penn State) ``Geodynamics''. John Louie's classes taught me quite a bit of programming in C as a bonus.
I have to particularly thank Satish Pullammanappallil for all the instructive discussions I had with him at different times, and for helping me with his optimization code, and for being a nice friend. I appreciate the help I enjoyed at different times from Arturo Aburto, Rasool Anooshehpoor, Glenn Biasi, Sergio Chavez-Perez, Diane dePolo, Bill Honjas, Li Li, Serdar Ozalaybey, Baoping Shi, Gordon Shields and Ken Smith. I would thank Gayle Wilson especially for being very helpful whenever I needed it badly.
My life in Reno was never boring because of the intimate friend Dr. Eltagh Mirghani whose personality always impressed me and drew me even closer to him. I appreciate everything he did to make my and my family's life in Reno pleasant. Amr Yassin, Mehmet Kanoglu and Muniruzzaman Khan deserve special thanks for lending me their cars whenever I needed for them at late night. I also thank Aboudou Akambi for being a nice friend and giving me an opportunity to practice my French. My warm thanks also go for Humayun Shamim and Jamaluddin for being very helpful friends.
Outside Reno, Shameem and Sabina Siddiqui, Mahbub Ahmed and Nurul Kabir came forward with unselfish helping hands at different times.
Lastly, my wife Tahmina (Dolly) always tried to understand and appreciate the problems a graduate student has to live with. Her unfailing and soothing support kept me going on at times when I was drooping. I greatly apologize to her and my son Afif for the hardship my student life meant to them for years.
Vis-a-vis the dilemma of the linearized methods' being closely dependent on a priori information and the fully nonlinear methods' being oftentimes computationally intractable, I adopt a compromise between these two using different combinations of both as the problem at hand warrants. I use a fully linearized least squares method of travel time inversion for hypocenters and local three-dimensional velocity model in the Eureka Valley area, a conjugate-gradient based method for mapping the Pg refractor at a regional scale and a fully nonlinear method of optimization by simulated annealing for obtaining a starting velocity model in an area where no prior model exists.
In the first part of this dissertation, I use a tomographic scheme in order to invert about 200,000 Pg travel times for an upper crustal refractor velocity mapping in the western Great Basin. The results feature a north-northwest trending upper crustal low velocity corridor that connects the eastern California shear zone with the Furnace Creek-Fish Lake Valley fault. A positive correlation between Pg velocity and gravity along the dominant structural trend of the western Great Basin (north-northwest) has also been inferred.
The second part of the dissertation proposes an inversion methodology composed of an in-tandem combination of nonlinear simulated annealing optimization and linearized inversion in order to solve the problem of determination of hypocenter locations and a 3-d velocity model of the Eureka Valley area. The velocity model derived adequately describes broad structural features such as a sedimentary basin at the local scale. The hypocenter locations, however, are a little scattered.
Finally, I do elaborate two-dimensional synthetic tests to determine an efficient definition of the critical temperature to be used in the cooling schedule of simulated annealing optimization for determination of source locations and velocities. I define the critical temperature on the triple criteria of change of shape of the error-versus-iteration curve for short runs at different temperatures, average least square error as a function of temperature, and number of accepted models as a function of temperature. I also propose a heuristic combination of the conventional ``acceptance'' criterion and a ``rejection'' criterion for probabilistic acceptance of models. Tests for separate velocity model and source location determinations proved promising.
Dedication
Acknowledgments
Abstract
1. General Introduction
1.1 Background
1.2 Organization
2. Upper crustal velocity structure of the western Great Basin by Pg
tomography of earthquake travel time data
2.1 Introduction
Previous works
Geology
2.2 Method
2.3 Data compilation and analysis
2.4 Results and discussion
Error estimation
Resolution tests
Central part of the model
Long Valley caldera
Northern part of the model
Comparison with gravity and topography
2.5 Conclusions
3. Travel time inversion for earthquake locations and three-dimensional
velocity structure in the Eureka Valley area, eastern california
3.1 Introduction
Geology
3.2 Data
3.3 Method
Nonlinear optimization
Minimum 1-D model estimation by VELEST
Linearized 3-D inversion
Implementation
3.4 Results and discussion
Hypocenter locations
Velocity model
3.5 Conclusions
4. On critical temperature, perturbation scheme, and probabilistic acceptance
of models in simulated annealing optimization for velocity and source
location estimation
4.1 Introduction
4.2 Synthetic model and data
4.3 Perturbation schemes
4.4 Determination of the critical temperature
Velocity model perturbation only
Source location perturbation only
Simultaneous perturbation of the velocity model
and source locations
4.5 Probabilistic acceptance of models
4.6 Results
Velocity model determination only
Source location determination only
Simultaneous determination of the velocity model
and source locations
4.7 Conclusions
5. Conclusions and recommendations
5.1 General conclusions
5.2 Recommendations
References
Appendix A: Nature of Pg waves
Use of upper crustal velocity models for determination of sedimentary basin geometry is a standard interpretation tool, along with, or independent of, gravity modeling. A seismic velocity model can efficiently demarcate sedimentary basins from the underlying basement (Hauksson and Haase, 1997; Pullammanappallil and Louie, 1997; Eberhart-Phillips, 1990). Sedimentary rocks, with the exception of salt and evaporites, have significantly lower compressional and shear wave velocities than the igneous and metamorphic rocks they overlie. A two-dimensional velocity section derived for the upper 10 km of the crust can reveal the outline of the base of many sedimentary basins. A three-dimensional velocity model can picture the overall geometry of the basin floor.
One of the most important uses of seismic velocity models in geophysics is in the rather difficult task of relating measurements in the time domain to the subsurface geology in the space domain. A seismic velocity model occupies a pivotal place in the time-to-depth conversion of time domain data. In seismic data processing, the most sophisticated tools of time series analysis and migration techniques cannot compensate for an incorrect velocity model. Placement of impedance discontinuities, whether they are layer boundaries, faults or unconformities, in their true spatial positions depends critically on the velocity model used for migration (Claerbout, 1985; Yilmaz, 1987).
Any earthquake location program incorporates a seismic velocity model representative of the area. The more accurately the velocity model describes the subsurface, the more precise the earthquake locations are expected to be; and precise locations of hypocenters are essential for proper tectonic interpretation of the earthquakes (Nelson and Vidale, 1990; Eberhart-Phillips, 1990).
Velocity modeling has also helped locate magma chambers and estimate their size and fraction of melt within them (Iyer and Dawson, 1993; Toomey and Foulger, 1989). Vp / Vs ratio can be very helpful in characterizing temperature conditions in these cases.
Mapping of a more or less widely present refractor (e.g., upper crustal Pg refractor) velocity can put major crustal faults into feature by the velocity contrast across the fault. Hearn and Clayton (1986) image the trend of the San Andreas fault from a Pg velocity model obtained by travel time tomography.
Methods of tomographic velocity modeling in practice vary widely depending on the nature of the application they are designed for, the computational facilities available, extent of linearization etc. Data type (e.g., travel time, waveform, body wave, surface wave, first arrival etc.) and scale of the model (e.g., local, regional, global) are two very important factors that determine which method to adopt for velocity modeling. Presence or absence of significant prior information also plays a role. In areas where previous works have revealed at least some gross features of the model, a linearized method can very easily be justified. But in virgin areas, from a seismic velocity modeling point of view, the absence of utilizable and dependable prior information understandably warrant as little linearization as is absolutely needed by some other factors, such as limitation of computational facilities.
The dilemma between the inadequacy of the prevalent linearized methods in preserving the nonlinear nature of the problem of travel time inversion for velocities
on one hand, and the excessive computational effort demanded by fully nonlinear
methods on the other motivated me to search for a compromise. In this dissertation, I use a fully linearized (small perturbations with respect to an initial model) least squares method (Thurber, 1983; Eberhart-Phillips, 1990)
of travel time inversion for hypocenters and local three-dimensional velocity model, a conjugate-gradient based method for mapping the Pg refractor at
a regional scale (Hearn and Ni, 1994; Nolet and Sneider, 1990; Paige and Saunders, 1982);
and a fully nonlinear method of optimization by simulated annealing (Pullammanappallil, 1994) for obtaining a starting velocity model in an area where no prior
model exists.
1.2 Organization
Chapters 2, 3 and 4 describe works done for the tomographic estimation of velocity structure and
source locations using different combinations of linearized and nonlinear
methods.
In Chapter 2, I map an upper upper crustal refraction velocity of the western Great Basin using the so-called Pg first arrival times. This Pg refractor velocity mapping is done with a view to bridging a gap in the central part of the western Great Basin in terms of velocity modeling work, compared to the quite extensive modeling works existing in the northern (Eaton, 1963; Prodehl, 1979; Allmendinger et al., 1987; Benz et al., 1990; Holbrook, 1990; Catchings and Mooney, 1991) and southern (Gibbs and Roller, 1966; Geist and Brocher, 1987; Serpa and de Voogd, 1987; Wernicke et al., 1996; Louie et al., 1997; Chavez-Perez et al., 1998) thirds of the region.
I use the tomographic scheme of Hearn and Ni (1994) in order to invert about 200,000 travel times from 15000 earthquakes occurring in the area and recorded by two regional seismic networks. Earthquakes contributing to the data set come from widely different areas such as the topographically elevated Long Valley Caldera-Mammoth Lakes, about one km lower southern Nevada, Owens Valley, and Mojave Desert. I made an elaborate selection and analysis of the preliminary data in order to discern any effect of the source areas. Eventually I treat all the selected Pg travel times in a combined tomographic scheme and obtain Pg velocity model for this region along with station delays.
My results accompany detailed error estimation using the resampling technique of ``bootstrapping,'' and resolution analysis using sinusoidal checkerboard tests. I also compare three Pg velocity profiles against profiles of Bouguer gravity and topography to find a negative correlation between Pg velocity and Basin and Range topography, and a positive correlation between Pg velocity and gravity along the dominant structural trend of the area (north-northwest). A more or less continuous curvilinear trend of low velocities appears to connect the eastern California shear zone with the Furnace Creek-Fish Lake Valley fault zones.
In Chapter 3, I propose an inversion methodology composed of an in-tandem combination of nonlinear simulated annealing optimization and linearized inversion in order to solve the problem of determination of hypocenter locations and a 3-d velocity model of the Eureka Valley area. This area has little prior information on velocity except for the regional one-dimensional model used by the University of Nevada, Reno Seismological Laboratory for routine preliminary location of earthquakes. I use the simulated annealing scheme of Pullammanappallil and Louie (1993), and the three-dimensional linearized inversion scheme of Thurber (1983), improved by Eberhart-Phillips (1990). The nonlinear optimization finds a realistic starting velocity model that is further fine-tuned in the linearized inversion that follows it. I use about 7000 P and S arrival times recorded at 8 portable and 20 permanent stations from the Eureka Valley earthquake sequence of May-June, 1993. The velocity model derived in Chapter 3 adequately describes broad structural features such as a sedimentary basin at the local scale. The hypocenter locations, however, are a little scattered.
Chapter 4 deals exclusively with two-dimensional synthetic tests to determine an efficient definition of the critical temperature to be used in the cooling schedule of simulated annealing optimization for determination of source locations and velocities. While previous workers define the critical temperature on the sole criterion of lowest average least square error (Pullammanappallil, 1994) or highest correlation (Basu and Frazer, 1990) in ``short runs,'' I define it on the triple criteria of change of shape of the error-versus-iteration curve for short runs at different temperatures, average least square error as a function of temperature, and number of accepted models as a function of temperature. I show that this definition can handle some ``pathological'' computational situations better than the other definition. I also propose a heuristic combination of the conventional ``acceptance'' criterion and a ``rejection'' criterion for probabilistic acceptance of models. This results in a substantially smaller number of probability computations and of models to be saved. Tests for separate velocity model and source location determinations proved promising.
The northern fringes of the Western Great Basin at around 40° latitude have been the subject of investigation from as early as the King survey of the nineteenth century. The first significant seismic study conducted in the recent past in the northwestern Basin and Range is the Fallon-Eureka refraction profile recorded by the U.S Geological Survey in the 1960s. Eaton (1963) and Prodehl (1979) interpreted this profile following different methodologies and obtaining significantly different results. Priestley and Brune (1978) conducted a surface- wave dispersion analysis in western Utah and Nevada. Priestley et al. (1982) interpreted Pn delay times in terms of varying crustal thickness of the western Great Basin. More recently, the Consortium for Continental Reflection Profiling (COCORP) (Knuepffer et al., 1987; Allmendinger et al., 1987), and the Program for Array Seismic Studies of the Continental Lithosphere (PASSCAL) (Benz et al., 1990; Holbrook, 1990; Catchings and Mooney, 1991) carried out seismic reflection and refraction studies to determine the crustal structure of the Basin and Range Province across one of the biggest areas of Cenozoic extension, roughly along 40° latitude.
The southern boundary area of the WGB comprising such important regions as Death Valley has also had quite a few seismic imaging and velocity modeling studies. (Gibbs and Roller, 1966; Geist and Brocher, 1987; Serpa and de Voogd, 1988; Wernicke et al., 1996; Louie et al., 1997; Chavez-Perez et al., 1998) Almost all the velocity models coming out of the above-mentioned studies give only two-dimensional sectional views of the crust. A roughly north-northwest to south-southeast rectangular region more than 300 km long lies between the two well-studied areas.
Inside this rectangle, the Long Valley
Caldera has drawn unique attention from many researchers, and has been fairly
well investigated by local earthquake travel time inversion (Romero et al., 1993),
seismic reflection and refraction surveys (Hill et al., 1985; Black et al., 1991),
and teleseismic studies (Dawson et al., 1990; Steck and Prothero, Jr., 1994; Weiland et al., 1995).
My study area also includes the Yucca mountain region, which has drawn considerable
interest lately due to a proposed nuclear waste repository.
Apart from these local studies, a few regional lithospheric-scale velocity works
have also been performed.
Ozalaybey et al. (1996) derived one-dimensional shear-velocity structure at
nine broadband seismic stations in northern Nevada from combined analysis of
receiver function and surface-wave phase velocity data.
Hearn et al. (1991, 1994) studied the crustal and upper mantle velocity structure
of the western United States by Pn travel time tomography. Jones et al (1992)
describe the heterogeneity of the lithosphere in the Basin and Range Province
by combining geological, geochemical and geophysical data.
Geology
Fenneman (1931) defined the northern subprovince of the Basin and Range province
as the Great Basin. My area of investigation is the western part of this
subprovince referred to as Western Great Basin and includes, but is not limited
to, the tectonic province of the Walker Lane Belt defined by Stewart (1988).
The Walker Lane Belt is characterized by mountain ranges of unusual shapes or
trends lying between the Sierra Nevada on the west and areas of long
north-northwest-trending basin-range topography on the east (Stewart, 1988).
Some of its important characteristics are strike-slip faulting,
areas of large-scale extension, detachment faults and metamorphic
core complexes.
The belt is a northwest-trending zone about
700 km long and 100-300 km wide and is situated between the ``bulk'' of the
Basin and Range structures to the east and the Sierra Nevada to the west.
The Walker Lane Belt receives approximately 22% of the Pacific-North American plate motion from the San Andreas system via the eastern California shear zone (Unruh et al., 1996).
The southern limit of my intended study is marked by the Garlock fault.
Because I am combining data sets from the Western Great Basin Seismic Network
(WGBSN) and Southern Great Basin Seismic Network (SGBSN) administered by the
University of Nevada, Reno Seismological Laboratory (UNRSL) with a data set from
the Southern California Seismic Network (SCSN) administered by the California
Institute of Technology (CIT) Seismological Laboratory, I actually define my
model from 34° to 41° north latitudes and from 114.5°
to 121° west longitudes (Figure 2.1). This extends the area to be investigated to
the Mojave Deserts in the south and beyond the San Andreas fault in the west.
The inclusion of the CIT dataset is crucial for getting better ray coverage of the
Death Valley region.
Most of the geologic structures that I am going to concentrate upon are characterized by the Basin and Range features. Some geophysical traits of the region are widespread seismicity, thin crust, low upper mantle P-wave velocity, high heat flow, large scale crustal extension, and long-lived episode of magmatism (Eaton, 1982). The Sierra Nevada in the western part of my model is a relatively stable uplifted batholithic block of Mesozoic age. The eastern escarpment of the Sierra Nevada marks the western margin of the Basin and Range and is characterized by a series of Quaternary to Recent volcanic centers of which Long Valley Caldera is a well-studied one. The extended structures of the greater Death Valley region lie in the south central part of my model area and constitute one of the areas of most active Cenozoic extension in the Basin and Range Province.
The average elevation in the northern half of the area of investigation is about
1.5 km, and that in the southern half is about 0.7 km, with an abrupt step at
around 37° latitude (Saltus and Thompson, 1995).
2.2 Method
I use Pg travel time tomography to map the putative Pg refractor velocity in
the western Great Basin. The nature of Pg wave is somewhat elusive and poses a
problem in parameterization needed for any tomography. As is discussed in Appendix I, researchers are split in their opinion about whether Pg is a turning wave
or a head wave. Head-wave parameterization works
almost perfectly for Pn tomography, but not always so for Pg tomography.
For a single source-receiver pair one may get identical travel times
for an infinite set of velocity models starting from a two-layered (head wave)
to a continuously varying (turning wave) model (Aki and Richards, 1980).
But, for a high number of receivers well-distributed over a range of distances,
the non-uniqueness decreases, and, in such cases,
head wave type arrivals are expected to render the travel time
versus distance curve into a straight line whereas turning wave type arrivals
will make the curve concave towards the distance axis (Figure 2.2). I count on this criterion, and in view of the fairly linear trend of the Pg segment of
the travel time plot of my data (Figure 2.3), I adopt a head wave type parameterization for the tomography (Figure 2.2b).
I follow the tomographic method of Hearn and Ni (1994), which fits travel time perturbations from the simple linear travel time curve obtained during the prior data analysis and selection stage. Figure 2.2b depicts the schematic path of a refracted Pg wave. The path consists of three parts - i) the ray path from the source to the refractor (S1A1 or S2A2), ii) the ray path along the Pg refractor (A1B1, A1B2 or A2B2), iii) the ray path from the refractor to the receiver (B1R1 or B2R2). This three-part description of the ray path entails the necessity of distributing the travel time perturbations into source and receiver delays and delays due to the refractor velocity heterogeneity. Accordingly, I parameterize the source and receiver components by source and receiver static delays, and refractor velocity heterogeneity by dividing the refractor into a two-dimensional grid of cells. The cell size used is approximately 5 km by 5 km. The total number of cells is 18018.
I adapt the Pn tomographic scheme of Hearn and Ni (1994) to the case of Pg tomography. Following is a short description of the scheme. First a straight line is fit by regression through the Pg arrival times. The slope of the line gives the mean Pg slowness and the intercept gives the mean overburden time. These two quantities constitute the starting model of the tomographic scheme. The residual travel time tij with respect to this starting model for the Pg path between source i and receiver j can be expressed by the modified time term equation (Scheidegger and Willmore, 1957):
(1)
si is
the slowness perturbation of the cell i, and n the number of Pg refractor cells traversed by the ray.
Station and source static delays may be described by the equation:
(2a)
(2b)
The above expressions of source and station delays indicate that both of them represent the combined effect of the concerned thickness and slowness of the overburden (upper crust), and these latter cannot be independently determined from the delay times. But some rough estimates of relative variations in the two parameters can be obtained.
Because of the errors in the origin times and source depths, I restrict myself to the interpretation of the station delays only.
My starting model for the tomographic scheme consists of only two quantities, the mean Pg slowness and the mean overburden delay, which are obtained from the slope and intercept respectively of the best-fit straight line through the travel time versus epicentral distance plot.
Hearn and Ni (1994) developed their tomographic scheme using the preconditioned
version of the LSQR algorithm of Paige and Saunders (1982) and Nolet (1984).
This algorithm, which is in essence of the conjugate gradient type, is very
efficient in solving large systems of equations, suppressing the propagation
of data errors and in the rate of convergence. It iteratively finds the least-squares
solution to the matrix equation obtained by expressing all the observed
travel times as linear combinations of the station delays, source delays and
slowness perturbations as in equation (1). Preconditioning is applied through
scaling the source and station delays by the square root of number of originating
or arriving rays, and scaling the slowness perturbation by the quantity
m =
, where Di is the total ray path length
for ray i, di is the length of ray i within the cell, and n is the
number of rays crossing the cell (Hearn and Ni, 1994). The preconditioning assures a fast convergence
for the algorithm.
A regularization scheme minimizing the Laplacian of the slowness image (Lees and
Crosson, 1989) was applied to the solution in order to adjust for variations
in ray density and obtain a smoother image.
2.3 Data compilation and analysis
The tomographic study under discussion involved an elaborate data compilation,
selection and analysis process, which need to be described in order to make the
interpretation of results adequately meaningful.
The travel time data used in the study come from three sources. Travel times from
earthquakes occurring between February 1, 1991 and December 31, 1995 and recorded
by the network administered by the University of Nevada, Reno Seismological
Laboratory (UNRSL) form the first subset. Travel times from events occurring
between January 1, 1991 and December 31, 1993 and recorded by Southern California
Seismic Network (SCSN) of the California Institute of Technology (CIT) form the
second subset. The third subset consists of only about 1000 travel times from 48
explosions shot between 1984 and 1991 in the Nevada Test Site (NTS) and recorded
by a maximum of 55 stations in southern Nevada (obtained from Arthur Chen at Rensslaer Polytechnic Institute).
I prepared a composite data base of all the first P and S arrivals in a modified form of the International Seismological Centre (ISC) format from the above three sources. The data base resulted in a text file of 439224 lines of 21 columns. Each line describes all the relevant pieces of information pertaining to a phase arrival in a self-contained fashion. Some of the information contained in each line are station name, source and station coordinates, origin and arrival times, back-azimuth etc.
Of all the events used in the study, locations and origin times of only the 48 explosions at the NTS are known precisely, while those from the two networks are subject to some imprecision due to their having been determined by automatic event location programs used by UNRSL and CIT. I did not try to relocate these events, and assumed that there is no systematic bias in the origin times and locations. The high number of travel times used will statistically minimize the ``noise'' introduced in the data due to incorrect event coordinates in space and time. Of course, datum correction for all the event depths was imperative for the above assumption, as depth of an event mentioned in the original network catalogs is only with respect to the average elevation of stations used in locating the event and not, as would be expected, with respect to a surface of constant elevation like the mean sea level. I made a datum correction for each event depth from the average elevation of all the stations recording direct or Pg arrivals from that event so as to determine the depth with respect to the mean sea level.
Figure 2.3a shows the travel time versus epicentral distance plot in gray scale for most of the phases of the data base. After preliminary examination of different trends in the plot it was found that the Pg arrivals define a fairly straight line segment between epicentral distances of 20 and 130 km. Arrivals at less than 20 km distance define a roughly hyperbolic pattern and represent the direct arrivals, and those at more than 140 km distance represent the Pn arrivals. I selected phases following a general criterion of a minimum of 10 stations per event and 10 events per station. This criterion was relaxed if absolutely necessary, such as for areas with very few events or stations. Figure 2.1 shows the locations of events and stations thus selected. The selected data set consists of 180000 travel times from about 15000 earthquakes to around 340 stations with epicentral distances between 20 and 130 km in the Western Great Basin. A substantial portion of the data comes from the Mammoth Lakes (LVC), Little Skull Mountain (NTS) and Landers (MD) sequences.
The linear trend in the travel time versus epicentral distance plot suggests that most of the arrivals are of the head-wave type of refracted wave. The scatter of the data around the fitted straight line reflects the i) difference of source depths, ii) difference of overburden velocities, iii) lateral heterogeneity of the refractor velocity etc. For a certain source, depth variation and overburden velocity variation will show up as a constant shift of the straight line of fit. But the effect of refractor velocity heterogeneity will not be so simple to observe. It is primarily this heterogeneity that I am investigating. The overburden velocity variation will be considered only when absolutely warranted, such as for very large station residuals.
Figure 2.3b shows a gray scale plot of delay versus epicentral distance for maximum absolute delays of 1.0 s with respect to the best fitting velocity. This and the delay histogram of Figure 2.3c show that the delays are fairly symmetrically distributed with respect to zero. The average velocity and refractor depth obtained by least squares fitting of all the data with less than 1.0 s absolute residual are 6.09 km/s and 7 km respectively which I choose for all the cells of the starting model.
Though I eventually treat all the selected Pg travel times in a combined tomographic scheme, here I conducted a separate analysis of travel times for each of the following six earthquake source regions: a) Mammoth Lakes and Long Valley Caldera (LVC), b) northern Nevada, north of the latitude of Long Valley Caldera (NNEV), c) Eureka Valley (EV), d) Little Skull Mountain and Nevada Test Site (NTS), e) Owens Valley-Indian Wells (immediately north of the Garlock fault and east of Sierra Nevada) (OWV), and f) Landers (in the southern Mojave desert). Figure 2.4 shows the histogram of source depths (before datum correction) for each of the six regions. The striking difference between the representative source depths in the LVC (7-8 km) and NTS (9-11 km) areas is immediately noticeable. If one takes into account the fact that the average elevation of LVC area is more than 1 km higher than that of NTS area, the relative difference of the source depths after datum correction becomes even greater. Linear regression through Pg data points for the LVC and NTS areas give 6.01 km/s and 6.04 km/s respectively for the Pg velocity. The depths to the Pg refractor are 5.1 km and 7.2 km assuming an overburden velocity of 5.7 km/s, and, 6.2 km and 8.2 km assuming an overburden velocity of 5.85 km/s.
Figure 2.5 shows the ray coverage of the data used. The uneven density of rays
at different parts of the model is conspicuous.
2.4 Results and discussion
I ran 100 LSQR iterations of the tomographic inversion for the entire region and obtained the Pg velocity image of Figure 2.6. The inverted velocity varies from 5.83 km/s to 6.3 km/s. Figure 2.7 shows the station static delays obtained from the tomography. Figure 2.8 shows delays after the tomographic inversion as a function of epicentral distance (a) and, a delay histogram(b). Comparing with Figure 2.3, one can easily see the improvement in delays achieved. The station delays vary from -0.56 s to 0.82 s. It is essential to look at the ray coverage map (Figure 2.5) in order to properly interpret the velocity and station delay maps, because the reliability of any feature on these maps depends on ray coverage of the feature area. As a general observation, one sees the highest velocity (6.3 km/s) on the southwestern side of the San Andreas fault, which could be the reflection of high velocity oceanic crust of the Pacific plate. The high velocity zone defines the strike of the San Andreas fault quite well. Hearn and Clayton (1986) find a similar Pg velocity pattern in this area.
Returning to the
main part of my area of investigation, namely, the western Great Basin and adjacent
structures, one can observe the low velocity patches of about 5.85 km/s
in the Long Valley Caldera (LVC), Amargosa Valley (AV) and Owens Valley areas,
and high velocity (6.15 km/s) in the Sierra Nevada.
Southern Nevada near NTS is characterized by the overall average velocity of
6.1 km/s. The occurrence of very large positive delays (maximum 0.82 s) in this
area suggests that the velocity could actually be lower, and also, the refractor
could be deeper than 7 km. The eastern and northern fringes of the NTS show abnormally
low velocities (5.83 km/s) and fairly large negative station delays. This
area being part of a region of thicker and older crust (Ozalaybey et al., 1996) is
expected to be rather fast. The large negative delays seem to account for this.
Error estimation
Standard errors of the velocity and delay parameters are determined using the
resampling technique of ``bootstrapping'' (Efron and Gong, 1983; Efron and Tibshirani,
1991; Koch, 1992). The bootstrap method of error estimation uses an original
data set to generate new data sets of equal size by
samples with replacement from the original data set. A large number of iterations
of the tomographic inversion scheme is run for each bootstrap data set. By using
considerable number of bootstrap realizations (50) a set of bootstrap
solutions (slowness and station delays) are obtained which is then used to
estimate the velocity and station delay standard deviations. Figures 2.9 and 2.10 show the standard deviations
of the Pg velocities and station delays. The maximum deviation error is
0.04 km/s. The San Andreas fault and eastern Nevada areas, and part of the Sierra
Nevada show the highest velocity uncertainties. Most of the western Great Basin
straddling the California-Nevada state border shows velocity errors of about 0.01 km/s.
The station delay errors (Figure 2.10) demonstrate somewhat erratic behavior, some
uncertainties being greater than the respective delay magnitudes. Two such stations are
to the southeast of the Long Valley Caldera.
Three other stations with very large delay
error (0.3 s) are situated at locations where ray density just decreases to
zero (Figure 2.5) (south of the San Andreas fault, west of the Sierra Navada and Long Valley
caldera, and far northern Nevada).
Resolution tests
I use four sinusoidal checkerboard test models to determine
the resolution of the tomographic inversion in points of the model.
The best-fitting average velocity of 6.09 km/s (which was used as the initial
model in the tomographic inversion of the real data) corresponds to a slowness
of 0.1642 s/km. For the synthetic test, I made the slowness vary by an amplitude
0.0039 with half-wavelengths of 1/2, 1/4, 1/6 and 1/10 degrees
for the four models. This makes the minimum and maximum velocity of the input
checkerboard pattern 5.81 km/s and 6.39 km/s respectively. Therefore, a recovery
of this velocity range in the output models would be considered 100%. One can
easily see how the percentage of recovery varies with the half-wavelength used.
The same distribution of sources and stations as in the case of the real data was used to compute travel times through the synthetic model. Both source and station delays were assumed zero for this computation and no noise was added. Figures 2.11-2.18 show the results of the resolution tests. Figures 2.11, 2.13, 2.15 and 2.17 are the velocity images obtained for half-wavelengths of 1/2, 1/4, 1/6 and 1/10 degrees respectively. Figures 2.12, 2.14, 2.16 and 2.18 show the corresponding station delay maps.
The best-resolved area is evidently that of the Long Valley Caldera (LVC), where checkerboard pattern is fairly well-recovered for all the wavelengths even though the recovered amplitude decreases with wavelength. The amplitude recovery is about 75% for the wavelength of 0.25 degree (Figure 2.13). The southern Owens Valley and the Mojave desert areas both show significant recovery (> 50%) at this wavelength. Recovery in the NTS area is 30-40% only. The region along the California-Nevada state border between LVC and NTS areas show nearly the same type of recovery. The San Andreas fault region shows some recovery only at the 0.5 degree wavelength.
The ideal resolution test result for the station delays would be zero delays for
all the stations. One can see in Figures 2.12, 2.14, 2.16 and 2.18 that the test results
obtained are far from ideal. A significant number of stations, especially in areas
of low ray density show appreciable delays (-0.2 s to +0.18 s for 0.5 degree
wavelength). The delay amplitude for these stations decreases with decreasing
wavelength. But delay amplitude for stations in areas of good ray density show
a gradual increase with decreasing wavelength. This can be attributed to
the low wavelength velocity variation in the original synthetic model
that is not efficiently recovered in the output velocity models. There seems
to be increasing trade-off between velocity and station delays with decreasing
wavelength.
The high number of stations with nonzero delays in southeastern Nevada at all the
wavelengths can partly explain the paradox of obtaining the big low velocity
patch (Figure 2.6) in that region.
Central part of the model
In order to look at the velocity image in more detail I produced two expanded
displays of the model, one featuring the area between LVC and the Garlock fault
(Figure 2.19), and the other featuring the area north of LVC (Figure 2.20).
Figure 2.19 shows a general high velocity (6.15-6.2 km/s) in the Sierra Nevada. Of
course, a good portion of it lacks enough resolution. This velocity may be
representative of the little-altered batholithic granite. A big high velocity
patch of 80 km by 80 km dimension, again in a poorly-resolved area, straddles the Garlock fault at its eastern
part and thins southwards in eastern Mojave desert. Hearn and Clayton (1986)
found a Pg velocity of less than 6.0 km/s in the Mojave region. More
recently Magistrale et al. (1992) in their three-dimensional tomographic modeling
of the southern California crust, found layers of 6.07 km/s between 4 to 8 km,
and 6.24 km/s between 8 to 14 km depth. I obtained Pg velocities in the range
5.95 to 6.2 km/s in the Mojave area (Figures 2.6 and 2.19).
A trend of low velocity (5.9-5.95 km/s) zone (shown by the dashed curve in the Figure 2.6) stretches northwards from central Mojave, crosses the Garlock fault east of the Owens Valley fault zone into the Owens Valley, and then jumps eastward through the northern part of the greater Death Valley area and meets another low velocity trend along the Furnace Creek fault zone. This latter trend of average 5.95 km/s extends from the Amargosa Valley-eastern Death Valley region north-northwestwards along the Furnace Creek fault zone, crosses the Fish Lake Valley and terminates in the Excelsior-Coldale area. In this trend, the Amargosa Valley shows an unusually low velocity of 5.85 km/s. Velocity in the Fish Lake Valley also is about 5.90 km/s. Most of the length of the Furnace Creek fault zone lies within a valley separating high mountains (Stewart, 1988). The low velocity may be the reflection of the zone of weakness of the upper crust underlying this valley. The Fish Lake Valley fault zone is characterized by a high slip rate (> 4 mm/yr) being responsible for about half of the Pacific-North American plate -boundary shear accommodated within the Basin and Range Province at that latitude (Reheis and Sawyer, 1997).
The Amargosa Valley belongs to north-trending
structural trough termed Kawich-Greenwater Rift by Carr (1990). This rift includes
a series of major volcanic centers near the western edge of the Nevada Test Site.
The thick line passing through Amargosa Valley in Figure 2.19 (labeled KGRGL)
shows the axis of a gravity low defining the rift. A trend of relatively low
Pg velocity (6.0 km/s) extending along KGRGL from Amargosa Valley
through Crater Flat and Timber Mountain volcanic complex can be observed.
Here is a correlation of low velocity, low gravity and volcanic centers.
Long Valley Caldera
In the northwestern part of Figure 2.19 and southern part of Figure 2.20, the Long
Valley Caldera (LVC) features as an area of characteristically low velocity of 5.87 km/s.
The low velocity is apparently the expression of the residual magma chamber of
the caldera. There is some difference of opinion about the depth of the residual
chamber among researchers. Dawson et al. (1990), and Sanders (1993) mention 7.0 km as the starting depth. Steck and Prothero, Jr. (1994)
found a tabular low velocity feature between 7 and 11 km depth, whereas Weiland
et al. (1995) place the center of the chamber at 11.5 km depth. They found a
25-30% slower velocity at this depth than the surroundings. In my study, the
difference between the LVC velocity and its surroundings is not as abrupt (only
about 4%). This may be explained by an averaging effect in the velocity sampling
of the Pg rays. The dimension of the LVC Pg low velocity
feature is about 20 km by 20 km.
Northern part of the model
The ray coverage and resolution obtained in the area north of the LVC in Figure
2.20 are mostly below average. One can observe the northward trend of slightly-
lower-than-average velocity in the Dixie Valley area with a little difficulty.
Whether this an artifact of the ray distribution or a reflection of the crustal
extension and geothermal gradient is hard to ascertain. The cross signs before
MNA and KVN denote the location of two braodband stations of those
names beneath which Ozalaybey et al. (1996) determined shear-wave velocity structure using a
combined analysis of receiver function and surface-wave phase velocity data.
Their S velocity model at MNA increases quickly (above 5 km depth) to 3.55 km/s and then
maintains roughly the same value all through the crust down to 32 km depth except for a low velocity layer at 7-8.5 km. The S velocity
at KVN increases more gently and reaches the 3.5 km/s value at 7 km and then
roughly linearly increases to 3.65 km/s over the next 25 km of the crust except for a low
velocity zone between 10 and 14 km depth. Assuming that the P wave velocity
model mimics the shape of the shear-wave velocity model excepting the velocity
inversions, one expects the MNA area to have higher P wave velocity relative to
KVN area in the depth range sampled by Pg wave. For a Vp/Vs
ratio of 1.732 this means a relative difference of 0.10 km/s in the P wave
velocity. My Pg velocity map (Figure 2.20)
shows this roughly the same relative difference.
The seismic profiles shot by COCORP (Allmendinger et al., 1987) and PASSCAL (Catchings and Mooney, 1991) are located around the latitude of 40°. The only profile along which there is any ray coverage at all in my study is the northeast-southwest transect of PASSCAL. The thick white dotted line in the northeastern quadrant of Figure 2.20 shows the location of the southernmost 90 km of this line passing through the Dixie Valley. Benz et al. (1990) determined the one-dimensional velocity model from Pg travel times among others and found a Pg velocity jump from 5.5 to 5.8 km/s at 60 km distance, and then a gradual increase to 6.1 km/s. Holbrook finds an average Pg velocity of 6.0 km/s at distances greater than 50 km for this profile. Pg velocities obtained in the this part of the present study lie between 6.0 and 6.05 km/s.
The high velocity (6.12 km/s) patch observed west of Reno (in Truckee) could
be explained by the existence of batholithic granites in the area.
Comparison with gravity and topography
Figures 2.21, 2.22, and 2.23 show profiles of Pg velocity, ray density, elevation and Bouguer gravity along two north-south lines and one east-west line. (see Figure 2.6 for
location of these lines). The north-south AA' line was chosen to pass through
areas of maximum possible ray coverage. As this line does not cross the important
structure of Long Valley Caldera, the second north-south line AA'' was also chosen.
The location of BB' was chosen to be perpendicular to as many structural trends
as possible. Inclusion of the ray density profile in Figures 2.21, 2.22, and 2.23 was intended
to facilitate the comparison of the Pg velocities with gravity and
elevation wherever the velocity result is supported by a sufficient ray coverage.
At long wavelengths, both profiles AA' and AA'' demonstrate that Pg velocities
and elevations have an inverse relationship, i.e., low velocities correspond
to high elevations and vice versa. Velocities and Bouguer gravities vary in the
same sense. Correlation between velocity and gravity can be observed at even
lower wavelengths than those between velocity and elevation.
Comparing the different profiles along BB' is not as straightforward partly because of significant lack of ray density at several locations.
There is positive correlation between Pg velocity and gravity at some individual
highs and lows. Figure 2.24 shows elevation versus Pg velocity and Bouguer gravity versus Pg velocity scatter diagrams for all cells in the model
having ray hit counts of 10 or more. These diagrams were done in order to discern
any overall correlation between Pg velocity and elevation and gravity.
There is no apparent trend in either of the two plots. Perhaps a more restrictive
choice of cells (other than basing only on ray density) may feature the relationship
observed in the north-south profiles.
2.5 Conclusions
I performed tomographic inversion for crustal refractor velocity in the western Great
Basin using the so-called Pg arrival times from several thousand
earthquakes. This work is the first Pg refractor velocity mapping done
in the western Great Basin using travel time tomography.
The results have been presented as a Pg velocity map for all the 5 km by 5 km cells of the model.
A by-product of the inversion was a station delay map which
gives some idea about major variation of the velocity that avoided being put
into the velocity map, or any significant deviation from the constant refractor
depth assumed.
At the individual structural level the Long Valley Caldera is characterized by a low Pg velocity of 5.87 km/s, Amargosa Valley by 5.85 km/s, and the Sierra Nevada by a high velocity of 6.15 km/s. At a more regional scale, low velocity trends are found to lie along areas of high slip rate such as Furnace Creek-Fish Lake Valley fault zones, volcanic centers such as the Kawich-Greenwater Rift, or crustal extension such as Death Valley and Amargosa Valley. The more or less continuous trend of low velocities running from the western Mojave Desert region northwards across the Garlock fault into the Owens Valley, and then bending eastwards across the Death Valley extended region to the Furnace Creek fault zone and finally terminating north of the Fish Lake Valley is a significant result of this study. This trend may play a role in the transfer of deformation from the Mojave block (Eastern California shear zone) to the Walker Lane Belt. (Unruh et al., 1996).
A positive correlation between Pg velocity and Bouguer gravity at individual structural and longer wavelength scales and a negative correlation between Pg velocity and elevation at long wavelengths are observed.
The motivation behind resorting to the inversion scheme to be
dealt with in this chapter comes from the absence of even a one-dimensional local
velocity model of the area, and also of enough geologic information to make
one. Eberhart-Phillips (1993) talks about two possible approaches for setting
up an inversion. The first approach is to start with a velocity model based on
previous geologic interpretations and try to fit the data with the model by
perturbing the latter as warranted by the data. The second approach is to start
with a simple 1-D model obtained with the same data and let the inversion
solution add complexity wherever needed by the data. Both the approaches have
the common implicit assumption of the starting model being close to the true
model as required by any linear or linearized inversion process.
Eberhart-Phillips (1990, 1993) prefers using a simple minimum 1-D model derived
by the program VELEST (Ellsworth 1977; Kissling 1988; Ellsworth et al., 1990)
from the same data set as would be used for 3-D inversion. Kissling et al. (1994)
prescribe the minimum 1-D model as the ideal initial reference model for local
earthquake tomography in order to minimize the artifacts caused by an
inappropriate initial model. But the ``recipe'' to calculate the said minimum
1-D models includes the establishment of an a prioi 1-D model from information
regarding the stratification of the area as the first step (Kissling et al., 1994).
But in a hitherto little
studied place like Eureka Valley none of the above approaches seems adequate simply because of the absence of reliable a priori geological and geophysical information. Hence,
the necessity of a method that does not depend on the initial model.
Two series of methods satisfy this condition. They are iterative strategies
requiring matrix manipulations (Spakman, 1993) and global optimization
schemes such as simulated annealing (Rothman, 1985, 1986; Sen and Stoffa, 1991) which do not require any matrix manipulation.
The sparse station distribution in the Eureka Valley area precludes
the effective use of any iterative strategy. Optimization by simulated annealing
is appealing in this special case because of its independence of the initial
model and ability to produce solutions with very sparse data. So my method of
choice for investigating structure and aftershock locations builds on a first
step of initial 3-D model estimation using nonlinear optimization by simulated
annealing (Pullammanappallil and Louie, 1993), followed by a fine-tuning step
using 3-D linearized inversion (Thurber, 1983; Eberhart-Phillips, 1990). I also
run a 3-D linearized inversion using a minimum 1-D model calculated by
VELEST and compare the two results.
Geology
Eureka Valley is situated in the southwestern
Great Basin, roughly midway between Death Valley and Owens
Valley (Figure 3.1). Structurally, it is a north-northwest trending valley
about 30 to 40 km long, with possibly shallow sediment depths
as reflected by moderate (< 10 mGal) gravity anomalies. Its topography
conforms to the regional trend, which is extremely rough but gradually rising
on average westward to the Sierra Nevada. From a seismotectonic point of
view, it belongs to the central portion of the Inyo-Mono section of the
wide Walker Lane belt defined by Stewart (1988) and characterized by a
complex system of aligned or subparallel dip-slip and strike-slip faults.
The Inyo-Mono Section is situated between the Furnace Creek
fault zone on the east and the Sierra Nevada on the west. Most of the big
mapped faults in this section strike north-northwest, but many smaller
mapped faults, including those in the Eureka Valley, trend slightly east of
north. The valley fill is mostly Quaternary eolian sand deposits and
alluvia flanked on the west by Pliocene volcanics, Mesozoic granite and
mostly Cambrian metasedimentary outcrops, and on the east by Carboniferous
and Cambrian metasedimentary rocks (Oliver and Robbins, 1978). Berkstresser, Jr.
(1974) reports the occurrence in the southern part of Eureka Valley of the tallest sand dune in California rising 208 m above the playa on which it stands.
3.2 Data
More than 500 aftershocks followed the M6.1 Eureka Valley earthquake of
May 17, 1993. About 150 permanent stations belonging to the Southern Great
Basin Seismic Network (SGBSN), the Western Great Basin Seismic Network
(WGBSN), the Southern California Seismic Network (SCSN), and the Northern
California Seismic Network (NCSN) recorded the main shock. Five portable
stations deployed by the Seismological Laboratory of the University of
Nevada, Reno (UNR) and three deployed by the California Institute of Technology
(Caltech) seismological
laboratory recorded hundreds of aftershocks in a span of 20 days,
of which
about 500 were coincident with events catalogued by UNR's permanent
networks.
Figure 3.1 shows the locations of permanent and temporary stations, along with preliminary UNR catalog locations of the events. At first glance it becomes evident that the spatial distribution of the permanent stations around the epicentral area offers only limited azimuthal coverage. To locate the aftershocks and find the local velocity structure of the epicentral area I used P and S arrivals at 20 permanent stations timed by the UNR Seismological Laboratory and at 8 portable stations that I timed myself.
The ratio between the number of S and P observations is several times
greater for the portable stations than for the relatively more distant
permanent stations.
Figure 3.2 shows a time-distance plot for P and S arrivals from 430 events at
portable and permanent stations at maximum 100 km epicentral distance,
based on the preliminary estimates of
origin times and hypocentral coordinates obtained at the UNR Seismological
Laboratory from a one-dimensional regional velocity model and the FASTHYPO
program (written by Robert Hermann of Saint Louis University). The figure
excludes obvious outliers by keeping only those picks that lie within a
window centered around the average straight line fit to the cloud of data
points. The arrival/travel times have weights of 0 to 4 assigned depending on
quality of pick, 0 being the best and 4 being the least reliable.
3.3 Method
My approach is a combination of nonlinear optimization and linearized inversion
(Asad et al., 1994) comprising the following stages:
1) Obtaining an initial three-dimensional velocity model using optimization by simulated annealing (Pullammanappallil and Louie, 1993).
2) Simultaneous relocation of hypocenters and determination of local velocity structure using linearized inversion (Thurber, 1983; Eberhart-Phillips, 1990).
For comparison, I also run an exclusively linearized inversion, where the stage 1 in the above approach is replaced by a step of finding the minimum 1-D model using VELEST (Kissling et al., 1994). A brief description of the three techniques involved in the two parallel approaches follows.
Nonlinear optimizationd
My nonlinear approach for simultaneous determination of hypocenters and
velocity structure is an outgrowth of the simulated annealing optimization
scheme of Pullammanappallil and Louie (1993,1994) and Pullammanappallil (1994).
This scheme in principle does not require any a priori
assumptions, and the results obtained by it are independent of any initial
model. For the forward problem of travel time computation, the scheme
avoids actual ray tracing and uses a fast finite-difference scheme based
on a solution to the eikonal equation (Vidale, 1990; Hole et al., 1992),
accounting for curved rays through arbitrarily variable velocities, and all
types of primary arrivals. S arrivals were used only to increase the number of
travel times assuming a constant Vp/Vs ratio of the square root of three, and no inversion for
S velocity model was done. Random perturbation of the volume and velocity of
a subset of the whole model is done from iteration to
iteration. Velocity within this subset is constant for each iteration.
The perturbed volume can vary from a single cell to the whole model,
while the velocity is allowed to vary from 1.5 km/s to 8 km/s. Random perturbation of
all the three coordinates of a randomly chosen hypocenter is done at each
iteration. The horizontal hypocentral coordinates and the depth are allowed to
vary by a maximum of 5 km and 14 km respectively between two iterations.
I use an iterative Monte-Carlo based optimization process to
solve the inverse problem.
Each iteration comprises the following
steps: 1) travel time computation through the current model and
determination of the least-square error; 2) simultaneous random
perturbation of origin time, hypocenter location, and velocity
structure, and computation of new least square error;
3) acceptance of the new model if the corresponding new error
is less than that of the previous iteration, plus a provision for
probability-based conditional acceptance of new models
if the new error is larger;
4) repetition of the previous steps until the annealing converges,
where the difference in the least square error between successive models
and probability of accepting new models become very small.
An important component of this study is the random forcing of hypocenter
location perturbations at each iteration.
As I will show below, this forcing appears to allow the optimization of
a reliable 3D-velocity model while relegating unacceptable errors in the
final, perturbed hypocenter coordinates.
Minimum 1-D model estimation by VELEST
Kissling et al. (1994) defines the minimum 1-D model as the 1-D velocity model that itself represents the least squares solution to the coupled hypocenter velocity model relation equation. In this model the layer velocities are approximately equal to the average velocity in the 3-D tomographic solution within the same depth range. The model is determined by a trial and error process starting with available a priori information about the subsurface structure.
Linearized 3-D inversion
In the linearized approach, I use the program SIMULPS originally developed by Thurber (1983) using
approximate ray tracing (ART) and pseudo-bending (PB) algorithms and further
improved by Eberhart-Phillips (1990),
for forward modeling of P- and S-wave arrival
times in an iterative, damped, least-squares inversion for hypocenters and
three-dimensional velocity structure. The method also employs the parameter
separation technique (Pavlis and Booker, 1980; Spencer and Gubbins, 1980) which
allows to computationally split the problem into two, e.g., solving for the hypocenters and velocity model separately, while maintaining the mathematically coupled
nature of the overall problem.
Evans et. al (1994) describe in detail the technique and all the parameters involved. Hauksson and Haase's (1997) study of the
three-dimensional velocity models of Southern California is one of the latest works done using SIMULPS.
Model parameterization in this method assumes a continuous velocity field. The earth structure is represented in three dimensions by velocities defined at discrete points, and velocity at any intervening point is determined by linear interpolation among the surrounding eight grid points. Values at the velocity nodes are systematically perturbed during inversion. Except the outermost nodes, which are always fixed, every node can either be kept fixed or included in the inversion. Derivative weight sum (DWS) is a useful measure of ray density in the neighborhood of a node (Toomey and Foulger, 1989) and is used to determine which of the nodes should be included in the inversion. DWS for a node is similar to the ray hit count, but weighted by the ray-node separation and raypath length in the vicinity of the node.
The ART algorithm selects the path with the least travel time from a suite of circular arcs connecting the source and receiver. The iterative PB method fine tunes the ray path obtained by ART to approximate better the true ray path dictated by local velocity gradients (Um and Thurber, 1987). I adopt Eberhart-Phillips' (1990) method of adjusting an initial ray path rather than Um and Thurber's (1987) method of calculating a new midpoint for each segment because the former allows much easier updating of the raypath each time a source-receiver path has to be calculated.
A circular arc is a poor approximation at large distances and overestimates
the path length, because a refracted wave is likely to be the first
arrival at such distances. For an area like Eureka Valley, where alluvium is
apparently underlain by rigid rocks, refracted waves can start arriving at
even 10 km distance. Because of this, and the fact that timing error usually
increases with distance, I chose a distance-weighting scheme of
weight 1 for source-receiver distances of up to 30 km, and 0 for 80 km and
beyond, with a linear tapering in between. The iterative pseudo-bending
scheme (Eberhart-Phillips, 1990) reduces the difference between exact and ART
raypaths for large source-receiver distances.
Implementation
I use P and S arrival times recorded at the UNR and Caltech portable stations exclusively to
perform nonlinear optimization for velocity and hypocenters, and add more picks from nearby
permanent stations to perform linearized inversion
in and near Eureka Valley. I timed a total of 1492 P wave arrivals
and 768 S wave arrivals at 8 portable stations from 429 events (Figure 3.2, lower
plot) recorded these stations.
The maximum number of arrival time picks per event is 16,
automatically limiting the number of events that could be effectively
studied, because the linearized inversion required a minimum of four picks per
event for location only. For simultaneous relocation and velocity estimation,
the required minimum number of data points increases by the number of velocity
model parameters.
Therefore, I add distance-weighted arrival times from permanent stations
up to 80 km epicentral distance increasing the number of observations to 6989
P and 1879 S arrival times from 430 events (including the main shock) recorded at 28 stations.
The nonlinear optimization has no theoretical requirement for any minimum number of arrival times.
Velocity model parameterizations for the linear inversion and nonlinear optimization schemes somewhat differed from each other. In the nonlinear optimization, I used a 35 (x) by 35 (y) by 20 (z) grid of 1 km blocks. The linearized inversion used a 13 (x) by 15 (y) by 9 (z) grid of nodes with a node spacing of 2 km at the center and near surface portion of the model and increasing outwards and in depth. A minimum DWS of 7 is used as a condition for any node to be included in the inversion. I plot all my results in the central 30 by 30 by 20 km volume by linearly interpolating wherever necessary.
I relocated the hypocenters of the 430 events and modeled the velocity structure simultaneously using four different processes (see Figure 3.3):
1) Simultaneous relocation and minimum one-dimensional velocity model estimation using VELEST.
2) Simultaneous relocation and three-dimensional velocity estimation by linearized inversion using the minimum 1-D model obtained in process 1 as the starting model.
3) Simultaneous relocation and three-dimensional velocity estimation by nonlinear optimization by simulated annealing.
4) Simultaneous relocation and three-dimensional velocity estimation by linearized inversion using initial locations from the UNR catalog and the three-dimensional velocity model obtained from process 3 as the starting model.
In view of the widely observed phenomenon that earthquakes in the Basin and Range Province occur almost exclusively in the upper 15 km of the crust (Eaton, 1982; Vetter and Ryall, 1983; dePolo et al., 1992), I used the constraint
of maximum hypocentral depth of 20 km for all four of the above processes.
3.4 Results and discussion
I present the relocated hypocenters and velocity models in the following two sections. As one can see, the locations in the two methods, i.e., process 1 followed by process 2, and process 3 followed by process 4, follow roughly the same
pattern in map view and cross-section, but the velocity models differ
significantly from each other.
Hypocenter locations
Figures 3.4 and 3.5 show maps of the epicentral locations obtained by the four
processes, and Figures 3.6 and 3.7 show vertical cross-sections of the
corresponding hypocenters along the box AB defined in Figures 3.4 and 3.5.
Also plotted on the maps are Bouguer
anomaly contours (smooth curves; Oliver and Robbins, 1978) and topographic
contours (rough curves). The respective relations among gravity,
topographic features, and epicentral locations are roughly maintained
in all four cases. The details, however, vary from case to case.
Figure 3.8 shows the histograms of rms errors of UNR catalog locations and locations
obtained by process 4. One can see the overall decrease in rms error.
The most consistent spatial distribution is the north-northwest trend of epicenters, well-defined in both the cases of simultaneous relocation and three-dimensional velocity estimation by linearized inversion (Figures 3.4b) and 3.5b). Cross-sections along AB in Figures 3.6 and 3.7 reflect the same consistency, though the cross-section for the nonlinear optimization case (Figure 3.7a) shows the locations much more scattered in comparison with the other three cases. The main event has been located at depths of 11, 11.5 and 10.5 km by processes 1, 2 and 4 respectively (Figures 3.6a, 3.6b and 3.7b). Process 3, whose input data consisted of arrivals to portable stations only, does not have a mainshock relocation. Deducing a unique fault plane through the hypocentral scatter in any of the cross sections is not straightforward. Relocated hypocenter locations from process 3 are much more scattered than those from the other 3 processes. Average rms residuals of location in processes 3 and 4 are 0.10 and 0.12 respectively. The mainshock has an rms residual of 0.05 in both the processes.
I infer two different planes of roughly same strike (165 degrees) from the scatter of points in the cross sections of Figures 3.6a, 3.6b and 3.7b. One plane, which does not contain the relocated mainshock, dips steeply (about 75 degrees) eastwards; and the other, including the relocated mainshock, dips slightly less steeply (about 60 degrees) westwards. Loper et al. (1993) report a normal mechanism for the mainshock obtained from regional surface wave inversion. By incorporating broadband data from several institutions to the data recorded by Berkeley Digital Seismic Network, they relocated the large aftershocks relative to the mainshock using a master event technique and suggest a relatively complex rupture history for the event based on examination of mainshock and aftershock records, and identify at least two subevents. Massonet and Feigl (1995) model both east and west dips with normal mechanism in their satellite synthetic aperture radar (SAR) interferometric work, though they rule out the eastward dip as a local minimum in their modeling scheme. Their best-fitting focal mechanism has a north-northwestward strike and westward dip. Another SAR study by Peltzer and Rosen (1995) confirms a west-dipping fault with a slightly NNE strike. They reconcile the observations of north-northwest alignment of epicenters and north-northeast strike of mapped local faults to infer that the earthquake occurred on a west-dipping north-northeast-striking fault plane with the rupture propagating diagonally upward and southward. The Caltech TERRAscope moment tensor (MT) and Harvard centroid-moment tensor (CMT) solutions give gentler westward dips. Table 1 summarizes the focal mechanism parameters determined by the different researchers along with those inferred in this study (using standard sign convention of strike and dip):
Table 1 Method Strike Dip Rake Reference (deg) (deg) (deg) TERRAscope MT -173 48 west - (Caltech website) Harvard CMT 210 30 west -93 (Dziewonski et al., 1994) SAR No. 1 173 54 west - (Massonet and Feigl, 1995) SAR No. 2 187 50 west - (Peltzer and Rosen, 1995) This study 165 60 west -One can see that the strike and dip obtained in my study match fairly well with the SAR results of Massonet and Feigl (1995). It is also not very different from the results of the other SAR study and TERRAscope except that their faults strike a little east of the north whereas mine does it north-northwestwards. My results differ considerably from the Harvard CMT solution, the latter giving a 45 degree more eastward strike and dip only half as steep. Finally, the rather complicated distribution of hypocenters on the east-west cross-section (Figures 3.6a, 3.6b and 3.7b), along with the somewhat curved distribution of epicenters (Figures 3.4a, 3.4b and 3.5b), can be the expression of a concave fault involvement and the wide range of values and directions of the strike and dip of the fault found by different methods may also be due to complications brought into play by the concave shape of the fault. A complex rupture history and the existence of two subevents as suggested by Loper et al. (1993) mentioned above can also be the reason behind the observed spatial configuration of the relocated hypocenters.
Table 2 Layer Velocity (km/s) Starting Depth (km) 1 4.79 0.00 2 5.13 2.00 3 5.58 4.00 4 5.84 6.00 5 5.96 8.00 6 6.09 12.00 7 6.21 16.00 8 6.50 20.00 9 7.85 30.00
Figures 3.9a and 3.9b show map views at 0 and 6 km depth respectively of the final three-dimensional velocity model obtained by simultaneous relocation and 3-D velocity estimation by linearized inversion with minimum 1-D model as the starting model (process 2 preceded by process 1 - Figure 3.3). Figures 3.10a and 3.10b show corresponding map views for the model obtained by 3-D linearized inversion with starting model estimated by simulated annealing (process 4 preceded by process 3 - Figure 3.3). In general, the velocity model from processes 1+2 has a higher average velocity than that from processes 3+4 (5.97 km/s versus 5.55 km/s). Velocity at surface in the first case varies from 3.6 km/s to 6.8 km/s whereas that in the second case varies from 2.7 km/s to 6.0 km/s only. There is quite good match in the spatial distribution of low and high velocity patches in the northeastern half of the the two maps, though the absolute velocities differ. In the southwestern half, whereas the map for processes 1+2 shows a more or less uniform velocity of about 4.8 km/s, that for processes 3+4 shows a 10 by 10 km high velocity patch of 5 km/s in a background of 4.4 km/s. In both the cases, the north-northwest trending epicenter distribution is underlain by a parallel relatively low velocity area flanked by higher velocities. The maps at 6 km depth have fewer features in common. A relatively low velocity north-northwest trend is present in the northeastern part of the maps. The trend is much better-defined for processes 3+4 than the other case. Also, the high velocity patch featured in the southwestern part of the surface map persists at 6 km. Figures 3.11a and 3.11b present standard errors with respect to velocity models at 6 km depth obtained by processes 2 and 4 respectively. The northeastern half of the models show higher errors than the southwestern half. Treatment of these errors without referring to the distribution of rays can be misleading. Figures 3.12a and 3.12b show the ray density in terms of derivative weight sum (DWS - discussed earlier) and diagonal elements of the resolution matrix at 6 km depth for process 4. The northeastern half shows much higher ray density compared with the other half. The highest resolution of 0.636 is obtained in the northern part of the model. Resolution in the southwestern part of the model is very low. Examining the velocity maps along with the ray density and resolution maps, I can say that even though the northeastern part of the model has higher errors its velocities are more reliable than those in the southwestern half because both ray density and resolution in the latter is low. It means that the 10 by 10 km high velocity block found in the southwestern part of the model by process 4 is not as reliable as the NNE low velocity trend found by both processes 2 and 4.
Figures 3.13 and 3.14 show vertical cross-sections of models obtained by processes 2 and 4 along CC' and DD' defined in Figures 3.9a and 3.10a. The hypocenters shown in the cross-sections are those located inside the volume limited by vertical planes along the dashed lines on either side of CC' and DD'. The low velocity trend observed on the map views are found as a basin in cross-section DD' of both the processes. The basin is about 8 km wide and 3-4 km deep. As observed in the case of map views, the absolute velocity of the basin in the case of process 4 is 0.8 km/s lower than that in the case of process 2. The mainshock is situated at depth under the western flank of the basin.
Because most of the hypocenters are shallower than 10 km, ray coverage at greater depths is poor and as such, the velocity cross-sections are not quite reliable at depths more than 10 km. The near-surface DWS is quite high in the eastern half of both sections CC' and DD' (not shown). Between 5 and 10 km depths the area with high DWS extends farther westwards. Therefore, the low velocity basin featuring in both map views and cross-sections for processes 2 and 4 is a structure supported by data, whereas the existence of the high velocity block found in process 4 is not equally supported by data. But its existence cannot be ruled out altogether. A better coverage of local stations in the southwest quadrant of the model would help in constraining the existence of the high velocity block (see Figure 3.1).
One can see a good correspondence of the gravity and topographic highs and lows with the high and low velocity areas in the velocity. The linear north-northwest low velocity trend coincides with the strike of the alluvial valley and also with a trend between two gravity lows. The 10 by 10 km high velocity block matches with a gravity high of comparable dimension.
In addition to qualitatively matching between trends in topography, gravity, and computed velocity model, I made a quantitative comparison of the fit between gravity anomalies and the two velocity models. Any gravity modeling from seismic velocity data first requires the velocities to be converted to densities. This conversion is almost always a problematic job because of the widely varying conditions under which different researchers have found their empirical relationships. Some of the formulas are derived entirely from studies on igneous and metamorphic rocks (Birch, 1961; Bateman and Eaton, 1967), while others are derived from exclusively sedimentary rock samples (Nafe and Drake, 1963; Gardner et al., 1974). Not a single one of these relationships can be used with perfect relevance to my study area. I used the following empirical formula of Gardner et al. (1974) to convert three-dimensional velocity models into three-dimensional density models:
= 0.23V0.25,
is density in g/cm3
and V is P wave velocity in ft/s.
The 3-D density models have as many blocks as the velocity models.
Then, using a reference density of 2.67 g/cm3 (this value was used for preparing
the Bouguer anomaly map of Oliver and Robbins, 1978), I computed gravity values at 1 km grid space at the surface by adding the
vertical components of the anomalies due to all the density blocks for each grid point (Grant and West, 1965).
Figures 3.15a and 3.15b show the computed gravity values (in gray scale) superimposed on the Bouguer gravity anomaly contour map of Oliver and Robbins (1978) for processes 2 and 4. Both the computed gravity maps show maximum contrasts twice as much as that in the Bouguer anomaly map. Whereas the maximum contrast on the Bouguer anomaly map is 25 mGal, those for processes 2 and 4 are 52 and 56 mGal respectively. Computed gravity values for process 4 are mostly negative, while those for process 2 are predominantly positive. Evidently this difference is due to the lower velocities of the process 4 model, which is on average 0.8 km/s slower than the process 2 model. The shape of the valley is modeled well by both the processes. The discrepancy between the magnitudes of gravity contrast on the Bouguer map and my modeled maps may suggest that even though outlines of the major low and high velocity regions have been obtained by the inversion, the details of lateral heterogeneity and of vertical variation of velocity have not been adequately discerned. Of course, another reason for the discrepancy lies in the inevitable difference between the physical conditions under which the empirical relationship was obtained and those prevailing in the Eureka Valley (exclusively sedimentary versus both sedimentary and, igneous and metamorphic rocks).
So far, I have shown that both processes 2 (linearized inversion with initial model obtained by minimum 1-D velocity modeling) and process 4 (linearized inversion with initial model obtained by nonlinear optimization by simulated annealing) map a few common features. The low velocity basin is mapped well by both the processes. But at depth the process 2 performs much more poorly than process 4 (Figures 3.9b and 3.10b). At 6km depth, the general appearance of the process 4 model is much more coherent and realistic. Also, the initial minimum 1-D model used for process 2 has a too high P wave velocity (4.79 km/s) at surface. Even though I attempted quite a few trial and error steps for the minimum 1-D modeling the results probably correspond to a local minimum. This is the inherent danger in any linearized inversion used for solving a nonlinear problem, especially in the absence of prior information. The relatively higher velocity obtained in the process 2 is reflection of the high velocity initial 1-D model.
One can see that the simulated annealing optimization provides a velocity model
that can be used as an initial model in a simultaneous linearized inversion of
velocity and hypocenters, to obtain realistic velocity models and
hypocentral locations. Hypocentral locations as such from the
optimization alone are very much scattered. This may be an effect of not
allowing the right degree of hypocentral perturbation during the
optimization of successive trial models. Pullammanappallil and Louie
(1993), in a similar simultaneous optimization of velocities and reflector
locations, found that inappropriate constraint of reflector ``hypocenters''
led to poor results. These observations led me to do simultaneous linearized
inversion for hypocenters and velocity using the velocity results of
nonlinear optimization as the initial velocity model, producing the successful
results above.
As a result of combining a linearized inversion with a nonlinear optimization in
tandem, I was able to obtain a realistic velocity model and fault
plane definition from hypocenters. The velocity model adequately describes
broad structural
features at the local scale. The nonlinear simulated annealing
optimization scheme, by virtue of its capability of finding the global error
minimum irrespective of any starting model, produced a velocity model for an
area like Eureka Valley where little is known about the local velocity
structure. The linearized inversion following the nonlinear optimization
could thus start from a fairly valid initial model and preliminary
hypocentral locations, and fine tune them.
3.5 Conclusions
I proposed and implemented an efficient optimization-cum-inversion algorithm
for studying an area with few a priori information.
P and S wave travel time data from a moderate earthquake sequence in
Eureka Valley, eastern California, allowed me to delineate the velocity
structure of the valley and surrounding areas, and define the fault plane
associated with the sequence. I find a north-northwest trending normal
fault dipping steeply westwards. This is in conformity with the dominant
type of slip mechanism prevalent in the southwestern part of the Great
Basin near the Sierra Nevada Range. Although I have enough confidence in
the timing of portable station data, their small number per event limited
the resolution of the results.
On the other
hand, the possibility of better results using a
higher number of arrivals at nearby permanent stations from the
seismic monitoring networks was compromised by lower confidence
in the picks.
Chapter 4
On critical temperature, perturbation scheme and
probabilistic acceptance of models in
simulated annealing optimization for
velocity and source location estimation
4.1 Introduction
The nonlinear nature of the problem of simultaneous determination of a subsurface
velocity model and earthquake hypocenter locations from travel times has pushed
researchers to devise ways of tackling the problem without resorting to indiscriminate
linearization, which oftentimes severely degrades the results. Monte Carlo methods
were devised with a view to handling nonlinear problems properly. But the purely
random way of exploring the model space in typical Monte Carlo methods proved
somewhat prohibitive for large models from a computational point view. Modified
Monte Carlo methods based on the principle of the so-called ``importance sampling'' provided a more effcient way of model space search. Simulated annealing (SA) and
genetic algorithm (GA) are two such methods. Of these two, SA enjoys a better
mathematical foundation and proof of convergence than GA (Sen and Stoffa, 1996).
In Chapter 3, I proposed an in-tandem combination of nonlinear optimization by simulated annealing (SA) and linearized inversion as a compromise between computation time economy, which is better achieved by linearized methods than fully nonlinear ones, and proper handling of the nonlinearity of the problem (which linearized methods do away with at the level of mathematical formulation of the problem). One major difficulty encountered in the SA optimization part of the scheme was its production of a fairly realistic velocity model but randomly scattered hypocenter locations. The linearized inversion following the SA trial used the SA-derived velocity model along with initial hypocenter locations from the preliminary catalog as an initial model for further fine-tuning. In this chapter I look in depth into the cooling schedule and scheme of parameter perturbation, and at the overall ``error dynamics'' for the type of SA code developed by Pullammanappallil (1994), with a view to improving its performance. I limit myself to synthetic tests on a simple two-dimensional model.
In the recent years, simulated annealing has gained increasing popularity in solving a wide spectrum of geophysical problems. Rothman (1985, 1986) solved the prevalent problem of ``cycle skips'' in linearized methods of residual statics computation of noisy data by following the Monte Carlo optimization technique of Metropolis et al. (1953) adapted by Kirkpatrick et al (1983) as simulated annealing. He proposed an extension of the the SA technique to the case of estimation of velocity. Use of simulated annealing in inversion of seismic travel times as well as waveforms for velocity estimation has since become an important area of research (Landa et al., 1989; Koren et al., 1991; Mosegaard and Vestergaard, 1991; Sen and Stoffa, 1991; Pullammanappallil and Louie, 1993, 1994).
A very important aspect in any problem of optimization by simulated annealing is the determination of the annealing schedule (also known as temperature or cooling schedule) to be used. The prefect annealing is a very slow cooling process that carefully favors ``crystallization'' and prevents formation of any ``non-crystalline'' structure from the ``melt'' obtained at a high temperature. From computational point of view, the same rate of slow cooling appears simply prohibitive in implementation. Search for parts of the annealing process that can accommodate faster cooling rates led to the inclusion of the concept of ``critical temperature'' into the annealing schedule. In physical annealing, the critical temperature is the temperature in the cooling process where rapid crystallization starts and a phase change occurs (Basu and Frazer, 1990). The cooling rate around this temperature has to be slow enough to favor the crystallization, but the temperature itself should also be high enough for the process to avoid defects (local minima) in the crystal. In simulated annealing, convergence near the critical temperature is rapid. The annealing schedule has to iterate enough above this temperature to ensure that potential local minimum entrapments are avoided. Basu and Frazer (1990) define the critical temperature as representing a balance point at which low-energy states are preferred, but the system is warm enough to tunnel between such states. They equate it to the temperature in the ``short runs'' at which the negative average energy (or the correlation coefficient) is maximum. Pullammanappallil's (1994) definition of the critical temperature is essentially the same. His critical temperature is the temperature in the ``short runs'' at which the average least square error is minimum.
Although there is some kind of consensus among the SA practitioners that a
fairly fast rate of cooling above the critical temperature can expedite the
annealing process without jeopardizing proper ``crystallization'' (convergence to
global minimum), there are considerable differences among them in its implementation. Geman and Geman (1984) use an inverse logarithmic function of the ``annealing time''
(1/log t). Basu and Frazer (1990), and Pullammanappallil (1994) find a
cascade function of constant temperatures with short ``staircases'' above and
significantly longer ``staircases'' below the critical temperature efficient
in solving seismic velocity estimation problems. While I follow the cascade-type
cooling schedule of Pullammanappallil (1994), I find his determination
of the critical temperature deficient in envisaging some situations
that can occur in the annealing schedule and affect the overall
convergence of the process in a detrimental way. In this chapter, I make a detailed
investigation into the process of determination of the critical temperature and
accommodate the abnormal situations into its definition. I also compare several
types of velocity and source location perturbation schemes. Finally, I propose
a slightly modified criterion of probabilistic acceptance of models.
4.2 Synthetic model and data
Two sets of synthetic first arrival travel times were computed
using 20 sources and 39 receivers in a 40 km by 20 km two-dimensional velocity
model.
I used Vidale's (1988) two-dimensional travel time computation code, which uses
a fast finite-difference scheme based on a solution to the eikonal equation.
The model in Figure 4.1a (henceforth called model 1) is used for velocity model
determination
only keeping the locations fixed. That in Figure 4.1b (henceforth called model 2) is used for source
location determination keeping the velocity fixed, and also for simultaneous
determination of velocity and source location determination.
Thus, we have a set of 780 travel times for each of the models.
This makes both the problems of velocity model determination (800 velocity model
parameters), and combined
velocity and source location determination (840 model parameters) under-determined. Only the problem of source location determination keeping the velocity fixed is over-determined (40 source coordinate parameters - 20 x and 20 z coordinates).
4.3 Perturbation schemes
In the following sections I will be referring to several schemes of model perturbation,
which are described below:
i) Scheme 1: The same amount of velocity perturbation is executed for all cells belonging to a rectangle of randomly chosen area (varying from zero to the whole model) at each iteration. The resultant velocity after perturbation for any iteration is randomly computed to lie in the range 1.50 km/s and 8 km/s (corresponding to slownesses of 0.667 s/km and 0.125 s/km). This range is much bigger than the range of velocities one expects for most of the cases. But such a big range proved helpful in avoiding entrapment in local minima and expediting convergence (Pullammanappallil, 1994).
ii) Scheme 2: The absolute x and z coordinates (with respect to x=0 and z=0) of one source are guessed separately and randomly in the 0-40 and 0-20 km ranges. Only one source, chosen randomly, is displaced at each iteration.
iii) Scheme 3: The absolute x coordinate of one source and z coordinate of another are guessed separately and randomly in the 0-40 and 0-20 km ranges. The two sources to be perturbed are chosen randomly at each iteration.
iv) Scheme 4: Maximum adjustment of 40 km for the x coordinate and 20 km for the z coordinate of one source with respect to present values is done at each iteration. The amount of x and z perturbations are computed randomly in the 0-40 and 0-20 km ranges respectively with the only constraint that the sources remain within the model area.
Smaller perturbations for the source coordinates, e.g., 5 km, performed very poorly and hence, all the tests I perform in the following sections are limited to the above four types of perturbations.
A least square error type objective function is used in the optimizations discussed in the following sections. For any iteration i, it is defined as:
As described earlier, two sets of synthetic first arrival travel times were computed
using 20 sources and 39 receivers in a 40 km by 20 km two-dimensional velocity
model.
For each of the 7 combinations, I show four curves, two describing the shape
of the least square error as a function of iteration for the accepted models,
one showing the average least square errors of all accepted models at each of the eight
temperatures, and one showing the number of accepted models for those
temperatures.
Velocity model perturbation only:
In this case, source locations were assumed to be known (those in Figure 4.1a)
and the velocity model is to be determined from the 780 synthetic travel times
computed through the model 1. A constant initial velocity of 6.0 km/s is used and velocity perturbation is executed according to scheme 1 described in section 4.3.
Least square errors as a function of iteration number showed a highly fluctuating pattern within the range 0.0 to 100.0 for temperatures 10 down to 0.1. At
temperature 0.01 the shape suddenly changed to a more or less smooth and monotonically decreasing curve.
Figures 4.2a and 4.2b show the shapes of error versus iteration number
at temperatures 0.1 and 0.01. Figure 4.2c shows the average error versus temperature and Figure 4.2d shows the number of accepted models versus temperature curves.
One can see that the change of shape of the error-versus-iteration curve is
reflected also in the average error-versus-temperature and number of accepted
models-versus-temperature curves. Both the curves show a steep ramp between
temperatures 0.1 and 0.01 and very negligible slope on either side of the ramp
(at temperatures higher than 0.1 and lower than 0.01). This is the typical
behavior in most of the cases. The lowest average error corresponds to the
temperature (0.01) giving the lowest number of accepted models and showing an abrupt
change from an oscillating pattern at the next higher temperature to a smoothly
decreasing pattern. Both the average error and the number of accepted models
maintain a constant value from temperature 0.01 downwards. According to Pullammanappallil's (1994) and Basu and Frazer's
(1990) definitions (henceforth referred to as the PBF criterion) the critical temperature for this case would be 0.01. Their
only criterion being average error, they do not look at the two other criteria,
e.g., change of shape of the error-versus-iteration curve from temperature to temperature, and sudden drop in the number of accepted models versus temperature, that I propose to examine. In the present case,
depending only on the average error criterion seems to be sufficient. But this
is not always the case, as I will show later.
Source location perturbation only:
In this case, the velocity model was assumed to be known (Figure 4.1b)
and the source locations are to be determined from the 780 synthetic travel times
computed through the model 1. At the beginning of the optimization, all the sources
are assigned to be located at the center of the model (x = 20 km and z = 10 km).
Source locations are perturbed using schemes 2, 3 and 4 described in section 4.3.
Figures 4.3, 4.4 and 4.5 show the results of the short runs using schemes 2, 3 and 4
respectively.
One can see the same fluctuating pattern in the curve showing least square errors as a function of iteration number in the temperature range 10 down to 0.1 as
in the case of velocity model determination only. But the the amplitude of
fluctuation is much smaller in this case (Figures 4.3a, 4.4a and 4.5a), evidently due to the small number of
parameters (40 source coordinates) being perturbed compared to a significantly higher number of parameters (800 velocity values) in the earlier case.
Of course, velocity and source location perturbations have different ways of
affecting the error values. Perturbation of a source parameter is bound to change
the error, whereas it is possible to have velocity model perturbations in the
unsampled part of the model without any change of error.
The change in the shape of error-versus-iteration curves occur between temperatures
of 0.1 and 0.01 in these cases also. The shape of this curve at temperature 0.01 has
an overall monotonically decreasing pattern with some remnant low amplitude
oscillations in the first 50-60 iterations for schemes 2 and 4. (Figures 4.3b and 4.5b). Figures 4.3c-4.3d, 4.4c-4.4d and 4.5c-4.5d
show average error and number of accepted models as functions of temperature for the three perturbation schemes.
The patterns in these two curves also closely mimic those in the case of velocity
perturbation only (Figure 4.2c-4.2d). Of course, one can observe a gentler slope of the ramp
in the average error curve in this case, conspicuously related to the small
amplitude of least square errors (< 5.0 s2).
The lowest average error for scheme 2 is achieved at temperature 0.0001 where it
is almost negligibly lower than that at 0.001. The average error in the
case of perturbation scheme 3 reaches a minimum at temperature 0.01 and increases
very slightly at lower temperatures, whereas that for scheme 4 reaches the lowest
value at 0.001 and remains constant for lower temperatures. The number of
accepted models reaches the lowest value at temperatures 0.001, 0.0001 and 0.001
for schemes 2, 3 and 4 respectively. Only in the case of scheme 3
is there correspondence between the lowest error and lowest number of accepted models.
My 3 criteria and the single PBF criterion
give the same critical temperature (0.01) in the case of scheme
3. My suggested critical temperatures for both schemes 2 and 4 are 0.01 whereas
those according to PBF are 0.0001 and 0.001. In view of the low number of accepted
models for temperatures 0.01 downwards for all the three schemes, it is evident
that the choice of critical temperatures depending solely on the lowest average error
criterion results in a much smaller number of accepted models at several temperatures higher than the chosen critical temperature for schemes 2 and 4,
and consequent insufficient evolution of the model before the cooling schedule reaches the
critical temperature.
Simultaneous perturbation of the velocity model and source locations:
In this case, both the velocity model and source locations are assumed unknown (Figure 4.1b)
and are to be determined from the 780 synthetic travel times
computed through the model 1.
A constant initial velocity of 6.0 km/s and constant initial location (x = 20 km and z = 10km) for all the sources are used.
Source location perturbation is done using the same three schemes (2, 3 and 4)
as in the previous section.
Velocity perturbation in each of these three cases follows that in scheme 1.
Figures 4.6, 4.7, and 4.8 show the results of this test for the three schemes. All the four curves in the case of scheme 2 (Figures 4.6a-4.6d) seem to be somewhat like weighted averages of the corresponding curves of the velocity-only and source-only perturbation experiments (Figures 4.2a-4.2d and 4.3a-4.3d). The exact mathematical nature of the weighting involved is not clear. I presume it depends mostly on the random number sequences used in the three cases (The number of random numbers used in each iteration of the present case is roughly equal to the sum of those used in the corresponding iteration of the two previous cases). My critical temperature in this case is 0.01 whereas that according to PBF criterion is 0.001. In this case, again, the PBF criterion causes a decrease in the number of accepted models.
The results of experiments using schemes 3 and 4 both reveal a pathological computational phenomenon that I call a ``glitch'' (Figures 4.7b and 4.8b). Whereas the patterns of error-versus-iteration curve at temperature 0.01 for both the schemes are fairly monotonically decreasing for more than 1000 iterations, the error abruptly jumps from 0.266 s2 at iteration 1236 to 31.85 s2 at iteration 1238 for scheme 3, and from 0.32 s2 at iteration 1188 to 20.96 s2 at iteration 1193 for scheme 4. After this ``glitch'' the error again assumes the roughly hyperbolical decreasing pattern. After investigation, I found that these glitches were caused by an unexpected combination of the differential error (i.e, difference between the error of the present iteration and that of the iteration at which the last model was accepted) and the random number used for determining the probability of acceptance. An immediate effect of the occurrence of the glitches is an increased average error and number of accepted models at temperature 0.01 (Figures 4.7c-4.7d and 4.8c-4.8d). As a consequence, the ramp between high average errors at high temperatures and low errors at low temperatures becomes wider than those in all other cases thus far studied.
Even though the number of accepted models increases slightly due to the glitch, the ramp between high and low rates of model acceptance retains roughly the same shape as in the previous cases. What should be the critical temperatures in these two cases? Consideration of the average error as a function of temperature with the exclusion of the shapes of error-versus iteration curve for each temperature and number of accepted models-versus-temperature curve will be very misleading in this case. The average errors at temperatures 0.01 and 0.001 for scheme 3 are 5.53 s2 and 0.71 s2 respectively. One can easily see that the average error in this case is caused by the glitch and does not reflect the ``normal error level'' at this temperatures (the average error over the first 1236 iterations of the short run at temperature 0.01 is only 0.89 s2). The significant difference between the average errors at temperatures 0.01 and 0.001 (5.53 versus 0.71, Figure 4.7c) would, in the absence of other criteria, absolutely justify the choice of 0.001 as the critical temperature. But the unusual glitch occurring in the short run and causing the high value of average error was an artifact of a rare combination of numbers at a particular iteration of the short run, and in its absence (which will, most certainly, be the case in the actual long run) the number of accepted models will be low. For the optimization to sample large enough number of models, a judicious choice of the critical temperature would be 0.01 supported by the criteria of change of pattern of error-versus-iteration curves (Figures 4.7a and 4.7b) and slope of the number of accepted models-versus-temperature curve (Figure 4.7d)
Figure 4.9 shows the evolution of the shape of the error-versus-iteration curves
with temperature for a three-dimensional ``blob'' model (a constant high velocity cube
in a lower velocity background) using a three-dimensional perturbation scheme similar to scheme 4 for a short run of 1000 iterations (Asad and Louie, 1996). The number of unknown
parameters (velocity cells and source coordinates) in this case was 1770. Figures
4.10a and 4.10b show the average error and number of accepted models at different
temperatures. In this case too, the PBF criterion suggests a lower critical
temperature than would be determined if all the three criteria that I propose
are considered.
4.5 Probabilistic acceptance of models
In any realization of simulated annealing the system is ``warmed'' until it is
completely ``melted''. In implementation it means starting the optimization at a
sufficiently high temperature where the probability of acceptance of models with
bigger errors is high, and thus ensuring that any bias to the initial model is
destroyed. By keeping an overall high error during the iterations of the optimization at high temperatures, premature entrapment in local minima are also
avoided. The cooling schedule has to be designed around such a choice of the
critical temperature that there are enough number of iterations at higher temperatures
facilitating ``escape'' from potential local minima, but not so many as can
unnecessarily delay the decrease of error towards the global minimum. Every SA
practititioner faces a dilemma here between the needs of escaping local minima
and preservation of model improvement, the latter being roughly defined as decrease in error.
A high number of accepted models causes a good evolution of the model by extensive sampling of the model space. A substantial part of the evolution of the model parameters take place at high temperatures. The rate of evolution decreases with decreasing temperature. Therefore, if the error level does not come down to a significantly low level before the optimization enters the low temperatures, it will be very difficult for the optimization process to reach the neighborhood of the global minimum. That is why proper choice of a critical temperature is so crucial.
In this section, I propose a heuristic combination of the ubiquitous ``acceptance'' criterion and a new ``rejection'' criterion for the probabilistic treatment of the issue of acceptance of new models with a view to handling the above-mentioned dilemma.
The general rule in simulated annealing methods is that a new model with a smaller error than that of the current model is accepted unconditionally and one with bigger error than that of the current model is conditionally accepted with a probability of
E/T),
E is the absolute value of the error difference and T is the temperature.
In practice, a random number selected from a uniformly distributed series
in the interval (0,1) is compared with Pc (Kirkpatrick et al., 1983). If it is less than Pc, the new model is accepted; if not, the currently accepted model is retained.
I use this scheme of probabilistic acceptance of models with higher error for
all the iterations (several thousands) in the starting temperature. This will
achieve the state of unbias to the initial model and also let the optimization
sample the model space to certain extent. Starting from the next lower temperature
I use what I call ``rejection'' criterion, according to which any model with
higher error than that of the currently accepted model is rejected unconditionally, while a model with lower error is accepted probabilistically. The same
probability function including the same definition of
E as above is used in this case too. The difference comes in the part of comparison with the
random number. In this case, the random number has to be bigger than Pc
for the model to be accepted. This essentially preserves the Gibbs distribution
nature of the probability.
In the acceptance criterion, more models with higher error are accepted at higher than at lower temperatures. In the rejection criterion more models with lower
error are rejected at higher temperatures than at lower temperatures.
In the acceptance criterion, the lower
E the higher the probability of
accepting a model with higher error, whereas, in the rejection criterion,
the higher
E the higher the probability of accepting a model with
lower error. Both acceptance and rejection based on probability
almost ceases at low temperatures resulting in lower error being the only and sufficient
practical condition of acceptance of new models.
Three advantages of the use of rejection criterion are:
The main disadvantage of using the rejection criterion is that the ``course of
evolution'' of the model is much shorter compared to that in the acceptance
criterion because of the difference in the number of models accepted in the two
cases. This is partly compensated for in the initial ``warming-to-melt'' stage of the
optimization where I use the acceptance criterion at the initial high temperature for
a few thousand iterations. The number of iterations necessary will depend on
the size of the model.
4.6 Results
I conducted long runs (20000 iterations) of simulated annealing for the following
cases:
Velocity model determination only:
I execute two runs, one using the standard ``acceptance'' criterion, and the other
using the combined ``acceptance-rejection'' criterion that I proposed in
section 4.5, for determining the velocity model from the 780 synthetic travel times
computed through the model 1. The source locations are all assumed known. A constant initial velocity model of 6.0 km/s is used.
The total number of accepted models in the standard run was 3457, of which 1846 were accepted with higher errors. Figure 4.11a shows the velocity model obtained at the 18957th iteration. The average least square of the 780 travel times at this iteration is 0.0007 s2. For the sake of comparison I plot the velocity using the same gray scale as that used for the true model (Figure 4.1a). The low velocity anomaly in the right part of the model is quite well recovered, but recovery of the high velocity anomaly at the left part is very poor.
Figure 4.11b shows the velocity model at the 16504th iteration of the combined
acceptance-rejection run. It corresponds to a least square error of 0.0008 s2. The total number of accepted models in this case was 4026, of which 2133
were accepted with high errors in the initial ``warming'' part of the process. 73
models were rejected. In this case, the high velocity anomaly at the left is
quite well recovered whereas the low velocity anomaly becomes more diffuse. However, the overall recovery of anomaly in this case is more balanced than the other case.
Figure 4.12 shows the histograms of final time errors for the two cases.
Source location determination only:
I execute four different long runs for determining source locations assuming the
velocity model to be known. Three of these four runs are conducted using perturbation
schemes 2, 3 and 4 and the standard ``acceptance'' criterion. The fourth
one was conducted using perturbation scheme 3 and the combined acceptance-rejection
criterion. The initial locations of all the sources for each of the four cases
were set at x = 20 km and z = 10 km. Figure 4.13 shows the results of the four runs. Figure 4.13a shows
the true velocity model and source locations. Figure 4.13b shows the locations
obtained by the four runs along with the true source locations. The 20 source locations in all the cases (including the true locations) are
connected by straight line segments in increasing original order of x in order to
facilitate comparison. At the first glance, the overall pattern of source
profiles obtained in all the cases seem to be close to the true profile except
for two sources in the right part of the profile obtained by scheme 4. These two
sources are offset by about 15 km in the x coordinate. The best locations are
obtained by the scheme 3 using standard ``acceptance'' criterion (mean absolute shift of 0.33 km for x and 0.25 km for z). Table 1 gives the mean absolute shift
in x and z for the four runs:
Table 1 Scheme Criterion x (km) z (km) 2 acceptance 0.41 0.33 3 acceptance 0.33 0.25 4 acceptance 2.08 0.56 4 acceptance 0.68 0.43 * 3 acceptance-rejection 0.44 0.29 * excluding the two anomalous sources
In all the runs the z coordinate is more accurately determinated than the x
coordinate. The higher accuracy of the scheme 3 compared to scheme 2 suggests
that uncoupled perturbation of x and z coordinates of the same source is more
efficient in source location determination than coupled perturbation. The
poorest quality of locations obtained by the scheme 4 reveals the limitation
of adjustment of previous coordinates in contrast to determination of absolute
coordinates at each iteration. The errors in locations obtained by scheme 3 and
acceptance-rejection criterion are comparable to those obtained by scheme 2 and
acceptance criterion.
Simultaneous determination of the velocity model and source locations:
In this case, the same 4 types of source location perturbation as discussed above are each
accompanied by a velocity perturbation following scheme 1 for simultaneously
determining the velocity model and the source locations. As in the tests of
critical temperature in section 4.3, a constant initial velocity of 6.0 km/s
and constant initial location (x = 20 km and z = 10km) for all the sources are used for long optimization runs of 20000 iterations.
Figure 4.14 shows the locations obtained for the four schemes. There is obvious deterioration of the quality of the location results in all the schemes with respect to those in the case of determination of source location only (Figure 4.13). Here, most of the source locations are confined towards the center of the model. Despite the considerably shifted locations for individual sources, the source location profiles still retain some resemblance to the true profile. Table 2 gives the mean absolute shift in x and z coordinates in the four cases:
Table 2 Scheme Criterion x (km) z (km) 2 acceptance 2.32 3.31 3 acceptance 2.42 4.17 4 acceptance 6.29 6.79 3 acceptance-rejection 2.62 2.51
Looking at the Table 2 one can easily conclude that the combination of the perturbation scheme 3 and acceptance-rejection criterion gives the best locations. Performance of the scheme 4 is by far the worst. This is also evident from the Figure 4.14b.
Figures 4.15, 4.16, 4.17, and 4.18 show the velocity models obtained in the four cases. The upper panel in each figure shows the velocity model in the same gray scale as used for the true model in Figure 4.1b. The gray scale level in the lower panel corresponds to the range of velocities in the resultant model. The true and obtained source profiles are plotted on the velocity model in each case. A common feature of all the obtained models is their overall bias towards low velocities. A relatively higher velocity patch (compared to the average velocity) is observed to appear in the left part of the three models obtained by the acceptance criterion. The resemblance between the true and obtained models ends here. As to the velocity model obtained by the scheme 3 and acceptance-rejection criterion it has no resemblance with the true model. Joint examination of the source profile and velocity model obtained in each case reveals the ``trade-off'' between them. The best source location and worst velocity model correspond