Content-Type: multipart/mixed; boundary="-------------9901301419309"

This is a multi-part message in MIME format.

---------------9901301419309
Content-Type: text/plain; name="99-37.comments"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="99-37.comments"

This is an update of 97-457 and appears in Physics Reports 
310, 1-96 (1999). A summary appears in the Notices of the AMS 
45, 571-581 (1998), mp_arc 98-339.
---------------9901301419309
Content-Type: text/plain; name="99-37.keywords"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="99-37.keywords"

Second Law, Thermodynamics, Entropy
---------------9901301419309
Content-Type: application/x-tex; name="secondlaw_esi_99.tex"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="secondlaw_esi_99.tex"

%%%%%%%%%%%%%%%%%%%%
%%THIS IS A PLAIN TEX FILE. IT IS SELF CONTAINED 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%CORRECTIONS OF THE VERSION OF AUG. 06 1998 
%%MADE IN ACCORD WITH THE GALLEY PROOFS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\magnification=\magstephalf
%\magnification=\magstep1
\baselineskip=3ex
%\baselineskip=4ex
\raggedbottom
\overfullrule=0pt
\font\fivepoint=cmr5
%\headline={\hfill{\fivepoint  EHLJY 05/Jan/99}}
\input epsf.sty
\def\d{{\rm d}}
\def\N{{\cal N}}
\def\T{{\cal T}}
\def\D{{\cal D}}
\def\I{{\cal I}}
\def\sr{{\cal R}}
\def\R{{\bf R}}
\def\S{{\cal S}}
\def\simt{\mathrel{\rlap{\hbox{$\sim$}}\raise.9ex\hbox{{\fivepoint
$\,$T}}}}
\def\sima{\mathrel{\rlap{\hbox{$\sim$}}\raise.95ex\hbox{{\fivepoint
$\,$A}}}}
\def\lanbox{\hbox{$\, \vrule height 0.25cm width 0.25cm depth 0.01cm 
\,$}}
\def\uprho{\raise1pt\hbox{$\rho$}}
\def\mfr#1/#2{\hbox{${{#1} \over {#2}}$}}
\def\boun{$\partial  A_X$}

\font\subsubt=cmtt10 scaled \magstep1
\font\subt=cmbx10 scaled \magstep1
\font\tit=cmbx10 scaled \magstep2 \font\eightpoint=cmr8
\font\sixpoint=cmr6
\font\ninepoint=cmr9
\font\sevenpoint=cmr7

\catcode`@=11
\def\eqalignii#1{\,\vcenter{\openup1\jot \m@th
\ialign{\strut\hfil$\displaystyle{##}$&
        $\displaystyle{{}##}$\hfil&
        $\displaystyle{{}##}$\hfil\crcr#1\crcr}}\,}
\catcode`@=12

%this is for automatic equation numbering
\def\eqlbl#1{\global\advance\equno by 1
  \global\edef#1{{\number\chno.\number\equno  }}
  (\number\chno.\number\equno  )}
%\eqno\eqlbl\fermat This is an example of usage
\newcount\chno    \chno=0
\newcount\equno   \equno=0


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%



\centerline{\tit THE PHYSICS AND MATHEMATICS OF}
\medskip
\centerline{\tit THE SECOND LAW OF THERMODYNAMICS}
\bigskip
\bigskip

\centerline{Elliott H. Lieb\footnote{$^*$}{\sixpoint Work partially
supported by U.S. National Science Foundation grant PHY95-13072A01.} }
\centerline{\it Departments of Physics and Mathematics, Princeton 
University}
\centerline{\it Jadwin Hall,  P.O. Box 708, Princeton, NJ  08544, USA}
\bigskip

\centerline{Jakob Yngvason
\footnote{$^{**}$}{\sixpoint Work partially
supported by the Adalsteinn Kristjansson Foundation, 
University of Iceland.} }
\centerline{\it Institut f\"ur Theoretische Physik, Universit\"at Wien,}
\centerline{\it Boltzmanngasse 5, A 1090 Vienna, Austria}
\footnote{}{\baselineskip=0.6\baselineskip\hskip -\parindent\sixpoint
\copyright  1997 by the authors.
Reproduction of this article, by any means, is permitted for 
non-commercial
purposes.\par}
\bigskip
\bigskip
{\narrower\smallskip\noindent

\bigskip\bigskip\noindent 

{\subt Abstract:} The essential postulates of classical thermodynamics 
are formulated, from which the second law is deduced as the principle 
of increase of entropy in irreversible adiabatic processes that take 
one equilibrium state to another.  The entropy constructed here is 
defined only for equilibrium states and no attempt is made to define 
it otherwise.  Statistical mechanics does not enter these 
considerations.  One of the main concepts that makes everything work 
is the comparison principle (which, in essence, states that given any 
two states of the same chemical composition at least one is 
adiabatically accessible from the other) and we show that it can be 
derived from some assumptions about the pressure and thermal 
equilibrium.  Temperature is derived from entropy, but at the start 
not even the concept of `hotness' is assumed.  Our formulation offers 
a certain clarity and rigor that goes beyond most textbook discussions 
of the second law.  }
\bigskip
\bigskip
\bigskip
1998 PACS: \  05.70.-a
\smallskip
Mathematical Sciences Classification (MSC) 1991 and 2000: 80A05,\ 80A10
%\centerline{\tit (DRAFT)}

\bigskip\bigskip\bigskip\bigskip\bigskip\bigskip\bigskip\bigskip\bigskip
This paper is scheduled to appear in Physics Reports {\bf 310}, 1-96 (1999)
\vfill\eject

{\vbox
{\ninepoint 
\baselineskip=0.9\baselineskip

\noindent
I. INTRODUCTION
\item{A.} The basic questions\dotfill 3
\item{B.} Other approaches\dotfill 6
\item{C.} Outline of the paper\dotfill 10
\item{D.} Acknowledgements\dotfill 11\break


\noindent 
II. ADIABATIC ACCESSIBILITY AND CONSTRUCTION OF ENTROPY   
\item{A.} Basic concepts\dotfill 12
\item\item{1.} Systems and their state spaces\dotfill 13
\item\item{2.} The order relation\dotfill 16
\item{B.}  The entropy principle\dotfill 18
\item{C.} Assumptions about the order relation\dotfill 20
\item{D.} The construction of entropy for a single system\dotfill 23
\item{E.} Construction of a universal entropy in the absence
           of mixing\dotfill 27
\item{F.} Concavity of entropy\dotfill 30
\item{G.} Irreversibility and Carath\'eodory's principle\dotfill 32
\item{H.} Some further results on uniqueness\dotfill 33\break

\noindent 
III. SIMPLE SYSTEMS
\item{{\phantom{A.}}} Preface\dotfill 36
\item{A.} Coordinates for simple systems\dotfill 37
\item{B.} Assumptions about simple systems\dotfill 39
\item{C.} The geometry of forward sectors\dotfill 42\break 

\noindent 
IV. THERMAL EQUILIBRIUM
\item{A.} Assumptions about thermal contact\dotfill 51
\item{B.} The comparison principle in compound systems\dotfill 55
\item\item{1.} Scaled products of a single simple system\dotfill 55
\item\item{2.} Products of different simple systems\dotfill 56
\item{C.} The role of transversality\dotfill 59\break

%\vfill\eject
\noindent 
V. TEMPERATURE AND ITS PROPERTIES
\item{A.} Differentiability of entropy and the definition of 
          temperature\dotfill 62
\item{B.} The geometry of isotherms and adiabats\dotfill 68
\item{C.} Thermal equilibrium and the uniqueness of entropy\dotfill
             69\break

\noindent 
VI. MIXING AND CHEMICAL REACTIONS
\item{A.} The difficulty in fixing entropy constants\dotfill 72
\item{B.} Determination of additive entropy constants\dotfill 73\break

\noindent
VII.  SUMMARY AND CONCLUSIONS\dotfill 83\break

\noindent
LIST OF SYMBOLS \dotfill 88\break

\noindent
INDEX OF TECHNICAL TERMS
\dotfill 89\break

\noindent
REFERENCES\dotfill 91\break

}}



\vfill\eject
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\noindent
{\tit I. INTRODUCTION}
\bigskip

The second law of thermodynamics is, without a doubt, one of the most 
perfect laws in physics.  Any {\it reproducible} violation of it, 
however small, would bring the discoverer great riches as well as a 
trip to Stockholm.  The world's energy problems would be solved at one 
stroke.  It is not possible to find any other law (except, perhaps, 
for superselection rules such as charge conservation) for which a 
proposed violation would bring more skepticism than this one.  Not 
even Maxwell's laws of electricity or Newton's law of gravitation are 
so sacrosanct, for each has measurable corrections coming from quantum 
effects or general relativity.  The law has caught the attention of 
poets and philosophers and has been called the greatest scientific 
achievement of the nineteenth century.  Engels disliked it, for it 
supported opposition to dialectical materialism, while Pope Pius XII 
regarded it as proving the existence of a higher being (Bazarow, 1964, 
Sect.  20).


\bigskip\noindent
{\subt A. The basic questions}
\bigskip

In this paper we shall attempt to formulate the essential elements of 
{\it classical } thermodynamics of equilibrium states and deduce from 
them the second law as the principle of the increase of entropy.  
`Classical' means that there is {\it no mention of statistical 
mechanics here} and `equilibrium' means that we deal only with states 
of systems in equilibrium and do not attempt to define quantities such 
as entropy and temperature for systems not in equilibrium.  This is 
not to say that we are concerned only with `thermostatics' because, as 
will be explained more fully later, arbitrarily violent processes are 
allowed to occur in the passage from one equilibrium state to another.

Most students of physics regard the subject as essentially perfectly 
understood and finished, and concentrate instead on the statistical 
mechanics from which it ostensibly can be derived.  But many will 
admit, if pressed, that thermodynamics is something that they are sure 
that someone else understands and they will confess to some misgiving 
about the logic of the steps in traditional presentations that lead to 
the formulation of an entropy function.  If classical thermodynamics 
is the most perfect physical theory it surely deserves a solid, 
unambiguous foundation free of little pictures involving unreal Carnot 
cycles and the like.  [For examples of `un-ordinary' Carnot cycles see 
(Truesdell and Bharatha 1977, p.~48).]

There are two aims to our presentation.  One is frankly pedagogical, 
i.e., to formulate the foundations of the theory in a clear and 
unambiguous way.  The second is to formulate equilibrium 
thermodynamics as an `ideal physical theory', which is to say a theory 
in which there are well defined mathematical constructs and well 
defined rules for translating physical reality into these constructs; 
having done so the mathematics then grinds out whatever answers it can 
and these are then translated back into physical statements.  The 
point here is that while `physical intuition' is a useful guide for 
formulating the mathematical structure and may even be a source of 
inspiration for constructing mathematical proofs, it should not be 
necessary to rely on it once the initial `translation' into 
mathematical language has been given.  These goals are not new, of 
course; see e.g., (Duistermaat, 1968), (Giles, 1964, Sect.  1.1) and 
(Serrin, 1986, Sect.  1.1).

Indeed, it seems to us that many formulations of thermodynamics, 
including most textbook presentations, suffer from mixing the physics 
with the mathematics.  Physics refers to the real world of experiments 
and results of measurement, the latter quantified in the form of 
numbers.  Mathematics refers to a logical structure and to rules of 
calculation; usually these are built around numbers, but not always.  
Thus, mathematics has two functions: one is to provide a transparent 
logical structure with which to view physics and inspire experiment.  
The other is to be like a mill into which the miller pours the grain 
of experiment and out of which comes the flour of verifiable 
predictions.  It is astonishing that this paradigm works to perfection 
in thermodynamics.  (Another good example is Newtonian mechanics, in 
which the relevant mathematical structure is the calculus.)  Our 
theory of the second law concerns the mathematical structure, 
primarily.  As such it starts with some axioms and proceeds with rules 
of logic to uncover some non-trivial theorems about the existence of 
entropy and some of its properties.  We do, however, explain how 
physics leads us to these particular axioms and we explain the 
physical applicability of the theorems.

As noted in I.C below, we have a total of 15 axioms, which might seem
like a lot. We can assure the reader that any other mathematical
structure that derives entropy with minimal assumptions will have at
least that many, and usually more. (We could roll several axioms into
one, as others often do, by using sub-headings, e.g., our A1-A6 might
perfectly well be denoted by A1(i)-(vi).) The point is that we leave
nothing to the imagination or to silent agreement; it is all laid out.

It must also be emphasized that our desire to clarify the structure of
classical equilibrium thermodynamics is not merely pedagogical and not
merely nit-picking. If the law of entropy increase is ever going to be
derived from statistical mechanics---a goal that has so far eluded the
deepest thinkers---then it is important to be absolutely clear about what
it is that one wants to derive.

Many attempts have been made in the last century and a half to
formulate the second law precisely and to quantify it by means of an
entropy function. Three of these formulations are classic (Kestin,
1976), (see also Clausius (1850), Thomson (1849)) and they can be paraphrased as 
follows:
\smallskip

{\sl Clausius:\/} No process is possible, the sole result of which is
that heat is transferred from a body to a hotter one.

{\sl Kelvin (and Planck):\/} No process is possible, the sole result 
of  which is that a body is cooled and work is done.

{\sl Carath\'eodory:\/} In any neighborhood of any state there are
states that cannot be reached from it by an adiabatic process.
\smallskip

The crowning glory of thermodynamics is the quantification  of these
statements by means of a precise, measurable quantity called entropy. 
There are two kinds of  problems, however. One is to give a precise
meaning to the words above. What is `heat'? What is `hot' and `cold'?
What is `adiabatic'?  What is a `neighborhood'? Just about the only word
that is relatively unambiguous is `work' because it comes from
mechanics. 

The second sort of problem involves the rules of logic that lead from
these statements to an entropy. Is it really necessary to draw pictures,
some of which are false, or at least not self evident?  What are all the
hidden assumptions that enter the derivation of entropy? For instance,
we all know that discontinuities can and do occur at phase transitions,
but almost every presentation of classical thermodynamics is based on
the differential calculus (which presupposes continuous derivatives),
especially (Carath\'eodory, 1925) and (Truesdell and Bharatha, 1977, p.~xvii).

We note, in passing, that the Clausius, Kelvin-Planck and 
Carath\'eodory formulations are all assertions about {\it impossible} 
processes.  Our formulation will rely, instead, mainly on assertions 
about {\it possible} processes and thus is noticeably different.  At 
the end of Section VII, where everything is succinctly summarized, the 
relationship of these approaches is discussed.  This discussion is 
left to the end because it cannot be done without first presenting 
our results in some detail.  Some readers might wish to start by 
glancing at Section VII.

Of course we are neither the first nor, presumably, the last to present
a derivation of the second law (in the sense of an entropy principle)
that pretends to remove all confusion and, at the same time, to achieve
an unparalleled  precision of logic and structure. Indeed, such attempts
have multiplied in the past three or four decades.  These other
theories, reviewed in Sect. I.B, appeal to their creators as much as
ours does to us and we must therefore conclude that ultimately a
question of `taste' is involved.

It is not easy to classify other approaches to the problem that
concerns us.  We shall attempt to do so briefly, but first let us state
the problem clearly.  Physical systems have certain states (which
always mean equilibrium states in this paper) and, by means of certain
actions, called {\it adiabatic processes}, it is possible to change the
state of a system to some other state.  (Warning: The word `adiabatic'
is used in several ways in physics. Sometimes it means `slow and
gentle', which might conjure up the idea of a quasi-static process, but
this is certainly not our intention.  The usage we have in the back of
our minds is `without exchange of heat', but we shall avoid defining
the word `heat'.  The operational meaning of `adiabatic' will be
defined later on, but for now the reader should simply accept it as
singling out a particular class of processes about which certain
physically interesting statements are going to be made.) Adiabatic
processes do not have to be very gentle, and they certainly do not have
to be describable by a curve in the space of equilibrium states. One is
allowed, like the gorilla in a well-known advertisement for luggage, to
jump up and down on the system and even dismantle it temporarily,
provided the system returns to some equilibrium state at the end of the
day.  In thermodynamics, unlike mechanics, not all conceivable
transitions are adiabatic and it is a nontrivial problem to
characterize the allowed transitions.  We shall characterize them as
transitions that have no {\it net} effect on other systems except that
energy has been exchanged with a mechanical source.  The truly
remarkable fact, which has many consequences, is that for every system
there is a function, $S$, on the space of its (equilibrium) states,
with the property that one can go adiabatically from a state $X$ to a
state $Y$ if and only if  $S(X) \leq S(Y)$. This, in essence,  is the
`entropy principle' (EP) (see subsection II.B).

The $S$ function can clearly be multiplied by an arbitrary constant 
and still continue to do its job, and thus it is not at all obvious 
that the function $S_1$ for system $1$ has anything to do with the 
function $S_2$ for system $2$.  The second remarkable fact is that the 
$S$ functions for all the thermodynamic systems in the universe can be 
simultaneously calibrated (i.e., the multiplicative constants can be 
determined) in such a way that the entropies are {\it additive}, i.e., 
the $S$ function for a compound system is obtained merely by adding 
the $S$ functions of the individual systems, $S_{1,2} = S_1+S_2$.  
(`Compound' does not mean chemical compound; a compound system is just 
a collection of several systems.)  To appreciate this fact it is 
necessary to recognize that the systems comprising a compound system 
can interact with each other in several ways, and therefore the 
possible adiabatic transitions in a compound are far more numerous 
than those allowed for separate, isolated systems.  Nevertheless, the 
increase of the function $S_1+S_2$ continues to describe the adiabatic 
processes exactly---neither allowing more nor allowing less than 
actually occur.  The statement $S_1(X_1)+S_2(X_2)\leq 
S_1(X'_1)+S_2(X'_2)$ does not require $S_1(X_1)\leq S_1(X'_1)$.
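
As a purely illustrative example (the numbers are hypothetical and are 
chosen only to make the arithmetic transparent), suppose that
$$
S_1(X_1)=3,\quad S_2(X_2)=1,\quad S_1(X'_1)=2,\quad S_2(X'_2)=4 .
$$
Then
$$
S_1(X_1)+S_2(X_2)=4\ \leq\ 6=S_1(X'_1)+S_2(X'_2),
$$
so the additive entropy permits the transition of the compound system 
from the pair of states $X_1, X_2$ to the pair $X'_1, X'_2$, even though 
$S_1(X'_1)<S_1(X_1)$, i.e., system $1$ by itself could not pass 
adiabatically from $X_1$ to $X'_1$.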

The main problem, from our point of view, is this: What properties of 
adiabatic processes permit us to construct such a function?  To what 
extent is it unique?  And what properties of the interactions of 
different systems in a compound system result in additive entropy 
functions?

The existence of an entropy function can be discussed in principle, as 
in Section II, without parametrizing the equilibrium states by 
quantities such as energy, volume, etc.  But it is an additional fact 
that when states are parametrized in the conventional ways then the 
derivatives of $S$ exist and contain all the information about the 
equation of state, e.g., the temperature $T$ is defined by $\partial 
S(U,V)/ \partial U|_V^{\phantom Y} = 1/T$.
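
To illustrate with a standard textbook example that is not part of the 
axiomatic development here, take for granted the familiar entropy of a 
monatomic ideal gas at fixed particle number (the symbols $N$ and $k$ 
and the function below are assumptions made only for this illustration),
$$
S(U,V) = Nk\left( {3\over 2} \ln U + \ln V \right) + {\rm const}.
$$
Differentiating with respect to $U$ at fixed $V$ gives
$$
{\partial S \over \partial U}\bigg|_V = {3Nk \over 2U} = {1\over T},
$$
which is the usual relation $U = {3\over 2} NkT$.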

In our approach to the second law temperature is never formally 
invoked until the very end when the differentiability of $S$ is 
proved---not even the more primitive relative notions of `hotness' and 
`coldness' are used.  The priority of entropy is common in statistical 
mechanics and in some other approaches to thermodynamics such as in 
(Tisza, 1966) and (Callen, 1985), but the elimination of hotness and 
coldness is not usual in thermodynamics, as the formulations of 
Clausius and Kelvin show.  The laws of thermal equilibrium (Section 
IV), in particular the zeroth law of thermodynamics, do play a crucial 
role for us by relating one system to another (and they are ultimately 
responsible for the fact that entropies can be adjusted to be 
additive), but thermal equilibrium is only an equivalence relation 
and, in our form, it is not a statement about hotness.  It seems to us 
that temperature is far from being an `obvious' physical quantity.  It 
emerges, finally, as a derivative of entropy, and unlike quantities in 
mechanics or electromagnetism, such as forces and masses, it is not 
vectorial, i.e., it cannot be added or multiplied by a scalar.  Even 
pressure, while it cannot be `added' in an unambiguous way, can at 
least be multiplied by a scalar.  (Here, we are not speaking about 
changing a temperature scale; we mean that once a scale has been 
fixed, it does not mean very much to multiply a given temperature, 
e.g., the boiling point of water, by the number 17.  Whatever meaning 
one might attach to this is surely not independent of the chosen 
scale.  Indeed, is $T$ the right variable or is it $1/T$?  In 
relativity theory this question has led to an ongoing debate about the 
natural quantity to choose as the fourth component of a four-vector.  
On the other hand, it does mean something unambiguous, to multiply the 
pressure in the boiler by 17.  Mechanics dictates the meaning.)

Another mysterious quantity is `heat'.  No one has ever seen heat, nor 
will it ever be seen, smelled or touched.  Clausius wrote about ``the 
kind of motion we call heat", but thermodynamics---either practical or 
theoretical---does not rely for its validity on the notion of 
molecules jumping around.  There is no way to measure heat flux 
directly (other than by its effect on the source and sink) and, while 
we do not wish to be considered antediluvian, it remains true that 
`caloric' accounts for physics at a macroscopic level just as well as 
`heat' does.  The reader will find no mention of heat in our 
derivation of entropy, except as a mnemonic guide.

To conclude this very brief outline of the main conceptual points, the
concept of {\it convexity} has to be mentioned. It is well known, as
Gibbs (Gibbs 1928), Maxwell and others emphasized, that thermodynamics
without convex functions (e.g., free energy per unit volume as a
function of density) may lead to unstable systems.  (A good discussion
of convexity is in (Wightman, 1979).) Despite this fact, convexity is
almost invisible in most fundamental approaches to the second law.  In
our treatment it is {\it essential} for the description of simple
systems in Section III, which are the building blocks of
thermodynamics.

The concepts and goals we have just enunciated will be discussed in
more detail in the following sections.  The reader who impatiently
wants a quick survey of our results can jump to Section VII where it
can be found in capsule form. We also draw the reader's attention to the
article (Lieb-Yngvason 1998), where a summary of this work appeared.
Let us now turn to a brief discussion of other modes of thought about
the questions we have raised.

\bigskip\bigskip\noindent
{\subt B. Other approaches}
\bigskip

The simplest solution to the problem of the foundation of 
thermodynamics is perhaps that of Tisza (1966), as expanded by Callen 
(1985) (see also (Guggenheim, 1933)), who, following the tradition of 
Gibbs (1928), postulate the existence of an additive entropy function 
from which all equilibrium properties of a substance are then to be 
derived.  This approach has the advantage of bringing one quickly to 
the applications of thermodynamics, but it leaves unstated such 
questions as: What physical assumptions are needed in order to ensure 
the existence of such a function?  By no means do we wish to minimize 
the importance of this approach, for the manifold implications of 
entropy are well known to be non-trivial and highly important 
theoretically and practically, as Gibbs was one of the first to show 
in detail in his great work (Gibbs, 1928).

Among the many foundational works on the existence of entropy, the 
most relevant for our considerations and aims here are those that we 
might, for want of a better word, call `order theoretical' because the 
emphasis is on the derivation of entropy from postulated properties of 
adiabatic processes.  This line of thought goes back to Carath\'eodory 
(1909 and 1925), although there are some precursors (see Planck, 1926) 
and was particularly advocated by (Born, 1921 and 1964).  This basic 
idea, if not Carath\'eodory's implementation of it with differential 
forms, was developed in various mutations in the works of Landsberg 
(1956), Buchdahl (1958, 1960, 1962, 1966), Buchdahl and Greve (1962), 
Falk and Jung (1959), Bernstein (1960), Giles (1964), Cooper (1967), 
Boyling (1968, 1972), Roberts and Luce (1968), Duistermaat (1968), 
Hornix (1968), Rastall (1970), Zeleznik (1975) and Borchers (1981).  
The work of Boyling (1968, 1972), which takes off from the work of 
Bernstein (1960) is perhaps the most direct and rigorous expression of 
the original Carath\'eodory idea of using differential forms.  See also 
the discussion in Landsberg (1970).

Planck (1926) criticized some of Carath\'eodory's work for not
identifying processes that are not adiabatic. He suggested basing
thermodynamics on the fact that `rubbing' is an adiabatic process that
is not reversible, an idea he already had in his 1879 dissertation.
{}From this it follows that while one can undo a rubbing operation by
some means, one cannot do so adiabatically.  We derive  this principle
of Planck from our axioms. It is very convenient because it means that
in an adiabatic process one can effectively add as much `heat'
(colloquially speaking) as one wishes, but the one thing one cannot do
is subtract heat, i.e., use a `refrigerator'.

Most authors introduce the idea of an `empirical temperature', and 
later derive the absolute temperature scale.  In the same vein they 
often also introduce an `empirical entropy' and later derive a 
`metric', or additive, entropy, e.g., (Falk and Jung, 1959) and 
(Buchdahl, 1958, et seq., 1966), (Buchdahl and Greve, 1962), (Cooper, 
1967).  We avoid all this; one of our results, as stated above, is the 
derivation of absolute temperature directly, without ever mentioning 
even `hot' and `cold'.

One of the key concepts that is eventually needed, although it is not 
obvious at first,  is that of the comparison principle (or
hypothesis), (CH). It concerns classes of thermodynamic states and 
asserts that for any two states $X$ and $Y$ within a class one can either
go {\it adiabatically} from $X$ to $Y$, which we write as
$$
X \prec Y,
$$ 
(pronounced ``$X$ precedes $Y$" or ``$Y$ follows $X$") or else one can
go from $Y$ to $X$, i.e., $Y \prec X$.  Obviously, this is not always
possible (we cannot transmute lead into gold, although we {\it can}
transmute hydrogen plus oxygen into water), so we would like to be able
to break up the universe of states into equivalence classes, inside
each of which the hypothesis holds. It turns out that the key
requirement for an equivalence relation is  that if  $X\prec Y$ and
$Z\prec Y$ then  either $X\prec Z$ or  $Z\prec X$.  Likewise, if
$Y\prec X$ and $Y\prec Z$ then either $X \prec Z$ or $Z\prec X$.
We  find this first clearly stated in Landsberg (1956) and it is also
found in one form or another in many places, see e.g., (Falk and Jung,
1959), (Buchdahl, 1958, 1962), (Giles, 1964).  However, all authors,
except for Duistermaat (1968), seem to take this postulate for granted
and do not feel obliged to obtain it from something else. One of the
central points in our work is to {\it derive} the comparison
hypothesis. This is discussed further below.
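
The logical link between the comparison hypothesis and the entropy 
principle can be seen in one line.  As a sketch, assume that on some 
class of states a function $S$ exists with the property stated earlier, 
namely $X\prec Y$ if and only if $S(X)\leq S(Y)$.  Since any two real 
numbers are comparable,
$$
S(X)\leq S(Y) \quad\hbox{or}\quad S(Y)\leq S(X),
\qquad\hbox{hence}\qquad
X\prec Y \quad\hbox{or}\quad Y\prec X .
$$
Thus the comparison hypothesis is a necessary condition for the 
existence of such an $S$ on that class; the substantive task is to 
establish the hypothesis itself.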

The formulation of the second law of thermodynamics that is closest to 
ours is that of Giles (Giles, 1964).  His book is full of deep 
insights and we recommend it highly to the reader.  It is a classic 
that does not appear to be as well known and appreciated as it should be.  His 
derivation of entropy from a few postulates about adiabatic processes 
is impressive and was the starting point for a number of further 
investigations.  The overlap of our work with Giles's is only partial 
(the foundational parts, mainly those in our section II) and where 
there is overlap there are also differences.

To define the entropy of a state, the starting point in both 
approaches is to let a process that by itself would be adiabatically 
impossible work against another one that is possible, so that the 
total process is adiabatically possible.  The processes used by us and 
by Giles are, however, different; for instance Giles uses a fixed 
external calibrating system, whereas we define the entropy of a state 
by letting a system interact with a copy of itself.  (According to 
R.\ E.\ Barieau (quoted in (Hornix, 1967-1968)) Giles was unaware of 
the fact that predecessors of the idea of an external entropy meter 
can be discerned in (Lewis and Randall, 1923).)  To be a bit more 
precise, Giles uses a standard process as a reference and counts how 
many times a reference process has to be repeated to counteract some 
multiple of the process whose entropy (or rather `irreversibility') is 
to be determined.  In contrast, we construct the entropy function for 
a single system in terms of the amount of substance in a reference 
state of `high entropy' that can be converted into the state under 
investigation with the help of a reference state of `low entropy'.  
(This is reminiscent of an old definition of heat by Laplace and 
Lavoisier (quoted in (Borchers, 1981)) in terms of the amount of ice 
that a body can melt.)  We give a simple formula for the entropy; 
Giles's definition is less direct, in our view.  However, when we 
calibrate the entropy functions of different systems with each other, 
we do find it convenient to use a third system as a `standard' of 
comparison.

Giles's work and ours use very little of the calculus.  Contrary to 
almost all treatments, and contrary to the assertion 
(Truesdell and Bharatha, 1977) that the differential calculus is the 
appropriate tool for thermodynamics, we and he agree that entropy and 
its essential properties can best be described by maximum principles 
instead of equations among derivatives.  To be sure, real analysis 
does eventually come into the discussion, but only at an advanced 
stage (Sections III and V in our treatment).

In Giles, too, temperature appears as a totally derived quantity, but 
Giles's derivation requires some assumptions, such as 
differentiability of the entropy.  We prove the required 
differentiability from natural assumptions about the pressure.

Among the differences, it can be mentioned that the `cancellation 
law', which plays a key role in our proofs, is taken by Giles to be an 
axiom, whereas we derive it from the assumption of `stability', which 
is common to both approaches (see Section II for definitions).

The most important point of contact, however, and at the same time the 
most significant difference, concerns the comparison hypothesis which, 
as we emphasized above, is a concept that plays an essential role, 
although this may not be apparent at first.  This hypothesis serves to 
divide the universe nicely into equivalence classes of mutually 
accessible states.  Giles takes the comparison property as an axiom 
and does not attempt to justify it from physical premises.  The main 
part of our work is devoted to just that justification, and to inquire 
what happens if it is violated.  (There is also a discussion of this 
point in (Giles, 1964, Sect. 13.3) in connection with hysteresis.)  To 
get an idea of what is involved, note that we can easily go 
adiabatically from cold hydrogen plus oxygen to hot water and we can 
go from ice to hot water, but can we go either from the cold gases to 
ice or the reverse---as the comparison hypothesis demands?  It would 
appear that the only real possibility, if there is one at all, is to 
invoke electrolysis to dissociate the ice, but what if electrolysis did 
not exist?  In other examples the requisite machinery might not be 
available to save the comparison hypothesis.  For this reason we 
prefer to derive it, when needed, from properties of `simple systems' 
and not to invoke it when considering situations involving variable 
composition or particle number, as in Section VI.

Another point of difference is the fact that convexity is central to 
our work.  Giles mentions it, but it is not central in his work 
perhaps because he is considering more general systems than we do.  To 
a large extent convexity eliminates the need for explicit topological 
considerations about state spaces, which otherwise have to be put in 
`by hand'.

Further developments of Giles' approach are in (Cooper, 1967), 
(Roberts and Luce, 1968) and (Duistermaat, 1968).  Cooper assumes the 
existence of an empirical temperature and introduces topological 
notions which permit certain simplifications.  Roberts and Luce have 
an elegant formulation of the entropy principle, which is 
mathematically appealing and is based on axioms about the order 
relation, $\prec$, (in particular the comparison principle, which they 
call conditional connectedness), but these axioms are not physically 
obvious, especially axiom 6 and the comparison hypothesis.  
Duistermaat is concerned with general statements about morphisms of 
order relations, thermodynamics being but one application.

A line of thought that is entirely different from the above starts 
with Carnot (1824) and was amplified in the classics of Clausius and 
Kelvin (cf.\  (Kestin, 1976)) and many others.  It has dominated most 
textbook presentations of thermodynamics to this day.  The central 
idea concerns cyclic processes and the efficiency of heat engines; 
heat and empirical temperature enter as primitive concepts.  Some of 
the modern developments along these lines go well beyond the study of 
equilibrium states and cyclic processes and use some sophisticated 
mathematical ideas.  A representative list of references is Arens 
(1963), Coleman and Owen (1974, 1977), Coleman, Owen and Serrin 
(1981), Dafermos (1979), Day (1987, 1988), Feinberg and Lavine (1983), 
Green and Naghdi (1978), Gurtin (1975), Man (1989), Owen (1984), 
Pitteri (1982), Serrin (1979, 1983, 1986), Silhavy (1997), Truesdell 
and Bharatha (1977), Truesdell (1980, 1984).  Undoubtedly this approach 
is important for the practical analysis of many physical systems, but 
we neither analyze nor take a position on the validity of the claims 
made by its proponents.  Some of these are, quite frankly, highly 
polemical and are of two kinds: claims of mathematical rigor and 
physical exactness on the one hand, and assertions that these qualities 
are lacking in other approaches on the other.  See, for example, Truesdell's 
contribution in (Serrin, 1986, Chapter 5).  The chief reason we omit 
discussion of this approach is that it does not directly address the 
questions we have set for ourselves.  Namely, using only the existence 
of equilibrium states and the existence of certain processes that take 
one into another, when can it be said that the list of allowed 
processes is characterized {\it exactly} by the increase of an entropy 
function?

Finally, we  mention an interesting recent paper by Macdonald
(1995) that falls in neither of the two categories described
above. In this paper  \lq heat\rq\ and \lq reversible processes\rq\ are
among the primitive concepts and the existence of reversible processes
linking any two states of a system is taken as a postulate.  Macdonald
gives a simple definition of entropy of a state in terms of the maximal
amount of heat, extracted from an infinite reservoir, that the system
absorbs in processes terminating in the given state. The reservoir thus
plays the role of an entropy meter. The further development of the
theory along these lines, however, relies on unstated assumptions about
differentiability of the so defined entropy that are not entirely
obvious.

%\vfill\eject
\bigskip\noindent
{\subt C. Outline of the paper}
\bigskip

In Section II we formally introduce the relation $\prec$ and explain 
it more fully, but it is to be emphasized, in connection with what was 
said above about an ideal physical theory, that $\prec$ has a well 
defined mathematical meaning independent of the physical context in 
which it may be used.  The concept of an entropy function, which 
characterizes this accessibility relation, is introduced next; at the 
end of the section it will be shown to be unique up to a trivial 
affine transformation of scale.  We show that the existence of such a 
function is {\it equivalent} to certain simple properties of the 
relation $\prec$, which we call axioms A1 to A6 and the `hypothesis' 
CH. Any formulation of thermodynamics must implicitly contain these 
axioms, since they are equivalent to the entropy principle, and it is 
not surprising that they can be found in Giles, for example.  We do 
believe that our presentation has the virtue of directness and 
clarity, however.  We give a simple formula for the entropy, entirely 
in terms of the relation $\prec$ without invoking Carnot cycles or any 
other gedanken experiment.  Axioms A1 to A6 are highly plausible; it 
is CH (the comparison hypothesis) that is not obvious but is {\it 
crucial} for the existence of entropy.  We call it a hypothesis rather 
than an axiom because our ultimate goal is to derive it from some 
additional axioms.  In a certain sense it can be said that the rest of 
the paper is devoted to {\it deriving} the comparison hypothesis from 
plausible assumptions.  The content of Section II, i.e., the 
derivation of an entropy function, stands on its own feet; the 
implementation of it via CH is an independent question and we feel it 
is pedagogically significant to isolate the main input in the 
derivation from the derivation itself.

Section III introduces one of our most novel contributions.  We {\it 
prove } that comparison holds for the states inside certain systems 
which we call {\it simple systems}.  To obtain it we need a few new 
axioms, S1 to S3.  These axioms are mainly about {\it mechanical} 
processes, and not about the entropy.  In short, our most important 
assumptions concern the continuity of the generalized pressure and the 
existence of irreversible processes.  Given the other axioms, the 
latter is equivalent to Carath\'eodory's principle.

The comparison hypothesis, CH, does not concern simple systems alone, 
but also their products, i.e., compound systems composed of possibly 
interacting simple systems.  In order to compare states in different 
simple systems (and, in particular, to calibrate the various entropies 
so that they can be added together) the notion of a {\it thermal join} 
is introduced in Section IV. This concerns states that are usually 
said to be in thermal equilibrium, but we emphasize that temperature 
is not mentioned.  The thermal join is, by assumption, a simple system 
and, using the zeroth law and three other axioms about the thermal 
join, we reduce the comparison hypothesis among states of {\it 
compound systems} to the previously derived result for simple systems.  
This derivation is another novel contribution.  With the aid of the 
thermal join we can prove that the multiplicative constants of the 
entropies of all systems can be chosen so that entropy is additive, 
i.e., the sum of the entropies of simple systems gives a correct 
entropy function for compound systems.  This entropy correctly 
describes all adiabatic processes in which there is no change of the 
constituents of compound systems.  What remains elusive are the 
additive constants, discussed in Section VI. These are important when 
changes (due to mixing and chemical reactions) occur.

Section V establishes the continuous differentiability of the
entropy and defines inverse temperature as the derivative of the entropy
with respect to the energy---in the usual way. No new assumptions are
needed here. The fact that the entropy of a simple system 
is determined uniquely by its adiabats and
isotherms is also proved here.

In Section VI we discuss  the vexed question of comparing
states of systems that differ in constitution or in quantity of matter.
How can the entropy of a bottle of water be compared with  the sum of
the entropies of a container of hydrogen and a container of oxygen? To
do so requires being able to transform one into the other, but this may
not be easy to do reversibly. The usual theoretical underpinning here
is the use of semi-permeable membranes in a `van't Hoff box' but such
membranes are usually far from perfect physical objects, if they exist
at all. We examine in detail just how far one can go in
determining the {\it additive} constants for the entropies of different
systems in the real world in which perfect semi-permeable 
membranes do not exist.

In Section VII we collect all our axioms together and
summarize our results briefly.
%\vfill\eject

\bigskip\noindent
{\subt D. Acknowledgements}
\bigskip



We are deeply indebted to Jan Philip Solovej for many useful discussions
and important insights, especially in regard to Sections III and VI. 
Our thanks also go to Fredrick Almgren for helping us understand convex
functions, to Roy Jackson, Pierluigi Contucci, Thor Bak and Bernhard
Baumgartner for critically reading our manuscript and to Martin Kruskal
for emphasizing the importance of Giles' book to us.  We thank Robin 
Giles for a thoughtful and detailed review with many helpful
comments. We thank John C. Wheeler for a clarifying correspondence about
the relationship  between adiabatic processes, as usually understood,
and our definition of adiabatic accessibility.  Some of the rough spots
in our story were pointed out to us by various people during various
public lectures we gave, and that is also very much appreciated.


A significant part of this work was carried out at Nordita in 
Copenhagen and at the Erwin Schr\"odinger Institute in Vienna;  we are
grateful for their hospitality and support.




%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\vfill\eject
%%%%%%%%%%%%%%%%%%%%
\noindent
\leftline {\tit II. ADIABATIC ACCESSIBILITY } \smallskip
\leftline {\tit \phantom{Ix}\enspace AND CONSTRUCTION OF ENTROPY }
\bigskip



Thermodynamics concerns systems, their states and an order relation
among these states.  The order relation is that of {\bf adiabatic
accessibility}, which, physically, is defined by processes whose only
net effect on the surroundings is exchange of energy with a mechanical
source.  The glory of classical thermodynamics is that there always is
an {\it additive}   function, called {\bf entropy}, on the state space
of any system, that {\it exactly} describes the order relation in terms
of the increase of entropy.

Additivity is very important physically and is certainly not obvious;
it tells us that the entropy of a compound system composed of two
systems that can interact and exchange energy with each other is the
sum of the individual entropies. This means that the pairs of states
accessible from a given pair of states, which is a far larger set than
merely the pairs individually accessible by the systems in isolation,
is given by studying the sum of the individual entropy functions. This
is even more surprising when we consider  that the individual entropies
each have undetermined multiplicative constants; there is a way to
adjust, or calibrate the constants in such a way that the sum gives the
correct result for the accessible states---and  this can be done once
and for all so that the same calibration works for all possible pairs
of systems.  Were additivity to fail we would have to rewrite the steam
tables every time a new steam engine is invented.
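
In symbols, and anticipating the precise formulation of the entropy
principle in Section II.B, the two properties just described can be
sketched as follows.  For this illustration we write $S$ for the entropy
and assume that the states $X$, $Y$ of one system and $X'$, $Y'$ of
another are comparable, as are the corresponding compound states. 
Monotonicity says that
$$
X\prec Y \quad \hbox{if and only if} \quad S(X)\leq S(Y),
$$
and additivity says that $S(X,X')=S(X)+S(X')$ for a compound state
$(X,X')$.  Taken together they give
$$
(X,X')\prec (Y,Y') \quad \hbox{if and only if} \quad 
S(X)+S(X')\leq S(Y)+S(Y').
$$
Thus a state change in which the entropy of one subsystem decreases,
$S(Y)<S(X)$, is not excluded for the compound system, provided that the
other subsystem compensates, i.e., $S(Y')-S(X')\geq S(X)-S(Y)$.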

The other important point about entropy, which is often overlooked, is
that entropy not only increases, but entropy also tells us exactly which
processes are adiabatically possible in any given system; states of high
entropy in a system are {\it always } accessible from  states of lower
entropy. As we shall see this is generally true but it could conceivably
fail when there are chemical reactions or mixing, as discussed in
Section VI.  

In this section we begin by defining these basic concepts more precisely,
and then we present the entropy principle.  Next, we introduce certain
axioms, A1-A6, relating the concepts.  All these axioms are completely
intuitive. However, one other assumption---which we call the {\it
comparison hypothesis}---is needed for the construction of entropy.  It is
not at all obvious physically, but it is an essential part of conventional
thermodynamics.  Eventually, in Sections III and IV, this hypothesis will
be {\it derived} from some more detailed physical considerations. For the
present, however, this hypothesis will be assumed  and, using it, the
existence of an entropy function will be proved. We  also discuss  the 
extent to which the entropy function is uniquely determined by the order
relation; the comparison hypothesis plays a key role here.

The existence of an entropy function is equivalent to axioms A1-A6 in
conjunction with CH; neither more nor less is required.  The state
space need not have any structure besides the one implied by the order
relation.  However, state spaces parametrized by the energy and work
coordinates have an additional, convex structure, which implies
concavity of the entropy, provided that the formation of convex
combinations of states is an adiabatic process.  We add this requirement
as axiom A7 to our list of general axioms about the order relation.
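
The step from convexity to concavity can be sketched in one line.  As an
illustration, assume that $X$ and $Y$ are states in a convex state
space, that $0<t<1$, and that the entropy $S$ is additive and satisfies
$S(tX)=tS(X)$; these properties are stated precisely later in this
section.  If the formation of the convex combination is an adiabatic
process, i.e., $(tX,(1-t)Y)\prec tX+(1-t)Y$, then the monotonicity of
$S$ under $\prec$ gives
$$
S(tX+(1-t)Y)\ \geq\ S(tX)+S((1-t)Y)\ =\ tS(X)+(1-t)S(Y),
$$
which is the statement that $S$ is concave.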

The axioms in this section are so general that they  encompass
situations where {\it all} states in a whole neighborhood of a given
state are adiabatically accessible from it. {\bf Carath\'eodory's
principle} is the statement that this does {\it not} happen for
physical thermodynamic systems. In contrast, ideal mechanical systems
have the property that every state is accessible from every other one
(by mechanical means alone), and thus the world of mechanical systems
will trivially obey the entropy principle in the sense that every state
has the same entropy.  In the last subsection we discuss the connection
between Carath\'eodory's principle and the existence of irreversible
processes starting from a given state.  This principle will again be
invoked when, in Section III, we derive the comparison hypothesis for
simple thermodynamic systems.

Temperature will not be used in this section, not even the notion of
`hot' and `cold'. There will be no cycles, Carnot or otherwise.  The
entropy depends only on, and is defined by, the order relation. Thus,
while the approach given here is  not the only path to the second law,
it has the advantage of a certain simplicity and clarity that at least
has pedagogic and conceptual value.  We ask the reader's
patience with our syllogisms, the point being that everything is here
clearly spread out in full view. There are no hidden assumptions, as
often occur in many textbook presentations. 

Finally, we hope that the reader will not be confused by our sometimes
lengthy asides about the motivation and heuristic meaning of our
various definitions and theorems. We also hope these remarks will not
be construed as part of the structure of the second law. The
definitions and theorems are self-contained, as we state them, and the
remarks that surround them are intended only as a helpful guide.


\bigskip\noindent
{\subt A. Basic concepts }
\bigskip

\noindent
{\subsubt 1. Systems and their state spaces}  
\medskip

Physically speaking a thermodynamic {\it system} consists of certain
specified amounts of different kinds of matter; it might be divisible
into parts that can interact with each other in a specified way. A
special class of systems called simple systems will be discussed in the
next section.  In any case the possible interaction of the system with
its surroundings is specified.  It is a ``black box'' in the sense that
we do not need to know what is in the box, but only its response to
exchanging energy, volume, etc.  with other systems.  The states of a
system to be considered here are {\it always}  equilibrium states, but
the equilibrium may depend upon the existence of internal barriers in
the system.  Intermediate, non-equilibrium states that a system passes
through when changing from one equilibrium state to another will not be
considered.  The entropy of a system not in equilibrium may, like the
temperature of such a system, have a meaning as an approximate and
useful concept, but this is not our concern in this treatment.

Our systems can be quite complicated and the outside world can act on
them in several ways, e.g., by changing the volume and magnetization,
or removing barriers.  Indeed, we are allowed to chop a system into
pieces violently and reassemble them in several ways, each time waiting
for the eventual establishment of equilibrium.

Our systems must be macroscopic, i.e., not too small. Tiny systems
(atoms, molecules, DNA) exist, to be sure, but we cannot  describe
their equilibria thermodynamically, i.e., their equilibrium states
cannot be described in terms of the simple coordinates we use later
on.  There is a gradual shift from tiny systems to macroscopic ones,
and the empirical fact is that  large enough systems conform to the
axioms given below.  At some stage a system becomes `macroscopic'; we
do not attempt to explain  this phenomenon or to give an exact rule
about which systems are `macroscopic'.

On the other hand, systems that are too large are also ruled out
because gravitational forces become important.  Two suns cannot unite
to form one bigger sun with the same properties (the way two glasses of
water can unite to become one large glass of water). A  star with two
solar masses is intrinsically different from a sun of one solar mass.
In principle,  the two suns  could be kept apart and regarded as one
system, but then this would only be a `constrained' equilibrium because
of the gravitational attraction.  In other words the conventional
notions of `extensivity' and `intensivity' fail for cosmic bodies.
Nevertheless, it is possible to define an entropy for such systems by
measuring its effect on some standard body.  Giles' method is
applicable, and our formula (2.20) in Section II.E (which, in the
context of our development, is used only for calibrating the entropies
defined by (2.14) in Section II.D, but which could be taken as an
independent definition) would allow it, too.  (The `nice' systems that
do satisfy size-scaling are called `perfect' by Giles.) The entropy, so
defined, would satisfy additivity but not extensivity, in the `entropy
principle' of Section II.B.  However, to prove this would require a
significant enhancement of the basic axioms.  In particular, we  would
have to take the comparison hypothesis, CH, for all systems as an axiom
--- as Giles does.  It is left to the interested reader to carry out
such an extension of our scheme.


A basic operation is {\bf composition} of two or more systems to form a
new system.  Physically, this simply means putting the individual
systems side by side and regarding them as one system.  We then speak
of each system in the union as a {\bf subsystem}.  The subsystems may
or may not interact for a while, by exchanging heat or volume for
instance, but the important point is that a state of the total system
(when in equilibrium) is described completely by the states of the
subsystems.


{}From the mathematical point of view a system is just a collection of
points called a {\bf state space}, usually denoted by $\Gamma$. The
individual points of a state space are called {\bf states} and are
denoted here by capital Roman letters, $X, Y, Z, $ etc.  {}From the
next section on we shall build up our collection of states satisfying
our axioms from the states of certain special systems, called {\it
simple systems}.  (To jump ahead for the moment, these are systems with
one or more work coordinates but with only one energy coordinate.)  In
the present section, however, the manner in which states are described
(i.e., the coordinates one uses, such as energy and volume, etc.)  is
of no importance.  Not even topological properties are assumed here
about our systems, as is often done.  In a sense it is amazing that
much of the second law follows from certain abstract properties of the
relation among states, independent of physical details (and hence of
concepts such as Carnot cycles). In approaches like Giles', where it is
taken as an axiom that comparable states fall into equivalence classes,
it is even possible to do without the system concept altogether, or
define it simply as an equivalence class of states. In our approach,
however,  one of the main goals is to derive the property which Giles
takes as an axiom, and systems are basic objects in our axiomatic
scheme.



Mathematically, the composition of two state spaces, $\Gamma_1$ and
$\Gamma_2$, is simply their Cartesian product
$\Gamma_1 \times \Gamma_2$.  In other words, the states in
$\Gamma_1 \times \Gamma_2$ are pairs
$(X_1,X_2)$ with $X_1 \in \Gamma_1 $ and $X_2 \in \Gamma_2 $.  {}From 
the physical interpretation of the composition it is clear that the two
spaces $\Gamma_1 \times \Gamma_2 $ and $\Gamma_2 \times \Gamma_1$
are to be identified. Likewise, when forming multiple compositions of 
state spaces, the order and the grouping of the spaces is immaterial. 
Thus $(\Gamma_1 \times \Gamma_2)\times \Gamma_3$,  
$\Gamma_1 \times (\Gamma_2\times \Gamma_3)$ and 
$\Gamma_1 \times \Gamma_2\times \Gamma_3$ are to be identified as far 
as composition of state spaces is concerned. Strictly speaking, a 
symbol like $(X_1,\dots , X_{N})$  with states $X_{i}$ in state 
spaces $\Gamma_{i}$, $i=1,\dots,N$ thus stands for an equivalence 
class of $N$-tuples, corresponding to the different groupings and 
permutations of the state spaces. Identifications of this type are 
not uncommon in mathematics (the formation of direct sums of vector 
spaces is an example).



A further operation we shall assume is the formation of {\bf scaled
copies} of a given system whose state space is $\Gamma$.  
If $t>0$ is some fixed number (the scaling parameter) the state space
$\Gamma^{(t)}$ consists of points denoted $tX$ with $X\in \Gamma$. On the
abstract level $tX$ is merely a symbol, or mnemonic, to define points in
$\Gamma^{(t)}$, but the symbol acquires meaning through the axioms given
later in Sect.\ II.C. In the physical world, and from Sect.\ III onward, 
the state spaces will  always be subsets of some $\R^n$ (parametrized by
energy, volume, etc.). In this case $tX$ has the concrete 
representation as the product of the real number $t$ and the vector 
$X\in\R^n$. Thus in this case $\Gamma^{(t)}$ is simply the image of 
the set 
$\Gamma\subset \R^n$ under scaling by the real parameter $t$.
Hence, we shall sometimes denote $\Gamma^{(t)}$ by $t\Gamma$.

Physically, $\Gamma^{(t)}$ is interpreted as the state space of a system
that has the same  properties as the system with state space $\Gamma$,
except that the amount of each chemical substance in the system has been
scaled by the factor $t$ and the range of extensive variables like energy,
volume etc. has been scaled accordingly.  Likewise, $tX$ is obtained from
$X$ by scaling energy, volume etc., but also the matter content of a state
$X$ is scaled by the parameter $t$.
{}From this physical interpretation it is clear
that $s(tX)=(st)X$ and ${(\Gamma^{(t)})}^{(s)}=\Gamma^{(st)}$ and we take
these relations also for granted on the abstract level. The same 
applies to the identifications $\Gamma^{(1)}=\Gamma$ and $1X=X$, and also
$(\Gamma_{1}\times\Gamma_{2})^{(t)}=\Gamma_{1}^{(t)}
\times\Gamma_{2}^{(t)}$ and $t(X,Y)=(tX,tY)$.
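
As a simple illustration of these rules, take a state space
$\Gamma\subset \R^2$ with coordinates energy and volume and let
$X=(U,V)\in\Gamma$; the numerical scaling factors below are chosen only
for the sake of the example.  Then
$$
3X=(3U,3V)\in\Gamma^{(3)}, \qquad 
\mfr1/3 (3X)=(\mfr1/3\cdot 3)X=1X=X\in\Gamma ,
$$
and for a compound state $(X,Y)$ with $X\in\Gamma_1$ and $Y\in\Gamma_2$
$$
2(X,Y)=(2X,2Y)\in \Gamma_{1}^{(2)}\times\Gamma_{2}^{(2)}
=(\Gamma_{1}\times\Gamma_{2})^{(2)} .
$$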


The operation of forming compound states is thus an associative and 
commutative binary operation on the set of all states, and the group 
of positive real numbers acts by the scaling operation on this set in 
a way compatible with the binary operation and the multiplicative 
structure of the real numbers.  The same is true for the set of all 
state spaces.  {}From an algebraic point of view the simple systems, 
to be discussed in Section III, are a basis for this algebraic 
structure.

While the relation between $\Gamma$ and $\Gamma^{(t)}$ is physically
and intuitively fairly obvious, there can be surprises. Electromagnetic
radiation in a cavity (`photon gas'), which is mentioned after (2.6),
is an interesting case; the two state spaces $\Gamma$ and
$\Gamma^{(t)}$ and the thermodynamic functions on these spaces are
identical in this case! Moreover, the two spaces are physically
indistinguishable. This will be explained in more detail in Section
II.B.


The formation of scaled copies involves a certain physical 
idealization because it ignores the molecular structure of matter.  
Scaling to arbitrarily small sizes brings quantum effects to the fore 
and macroscopic thermodynamics is no longer applicable.  At the other 
extreme, scaling to arbitrarily large sizes brings in unwanted 
gravitational effects as discussed above.  In spite of these well 
known limitations the idealization of continuous scaling is common 
practice in thermodynamics and simplifies things considerably.  (In 
the statistical mechanics literature this goes under the rubric of the 
\lq thermodynamic limit\rq.)  It should be noted that scaling is quite 
compatible with the inclusion of `surface effects' in thermodynamics.  
This will be discussed in Section III.A.




By composing scaled copies of $N$ systems with state spaces $\Gamma_1,
\dots , \Gamma_N$, one can form, for $t_1,\dots,t_N>0$, their {\bf scaled
product} $\Gamma^{(t_1)}_1 \times \cdots \times \Gamma^{(t_N)}_N$ whose 
points are $(t_1 X_1, t_2 X_2, \dots , t_N X_N)$. 
In the particular case that the  $\Gamma_j$'s are identical, i.e.,
$\Gamma_1= \Gamma_2 = \cdots =\Gamma$, we shall call any space of
the form $\Gamma^{(t_1)} \times \cdots \times \Gamma^{(t_N)}$ a {\bf
multiple scaled copy} 
of $\Gamma$.
As will be explained later in connection with Eq.\ (2.11), it is sometimes
convenient in calculations to allow $t=0$ as a scaling parameter (and even
negative values). For the moment let us just note that if $\Gamma^{(0)}$
occurs the reader is asked to regard it as the empty set or `nosystem'. In
other words, ignore it.

\smallskip


Some examples may help clarify  the concepts of systems and state
spaces. \smallskip

\item{(a)} $\Gamma_a$: 1 mole of hydrogen, H$_2$. The state space can
be identified with a subset of $\R^2$ with coordinates $U$ ($=$
energy), $V$ ($=$ volume).

\item{(b)} $\Gamma_b$: $\mfr1/2$ mole of H$_2$.  If $\Gamma_a$ and
$\Gamma_b$ are regarded as subsets of $\R^2$ then  $\Gamma_b =
\Gamma_a^{(1/2)} = \{(\mfr1/2 U,\mfr1/2 V) : (U,V) \in \Gamma_a \}$.

\item{(c)} $\Gamma_c$: 1 mole of H$_2$ and $\mfr1/2$ mole of O$_2$
(unmixed). $\Gamma_c = \Gamma_a \times \Gamma_{(\mfr1/2 \ {\rm mole \
O}_2)}$. This is a compound system.

\item{(d)} $\Gamma_d$: 1 mole of H$_2$O.

\item{(e)} $\Gamma_e$: 1 mole of H$_2 + \mfr1/2$ mole of O$_2$ (mixed).
Note that 
$\Gamma_e \not= \Gamma_d$ and $\Gamma_e \not= \Gamma_c$. This system
shows the perils inherent in the concept of equilibrium. The system
$\Gamma_e$ makes sense as long as one does not drop in a piece of
platinum or walk across the laboratory floor  too briskly. Real world
thermodynamics requires that we admit such quasi-equilibrium systems,
although perhaps not quite as dramatic as this one.

\item{(f)} $\Gamma_f$: All the  equilibrium states of one mole of H$_2$
and half a mole of O$_2$ (plus a tiny bit of platinum to speed up the
reactions) in a container.  A typical state will have some fraction of
H$_2$O, some fraction of H$_2$ and some O$_2$. Moreover, these
fractions can exist in several phases.
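
The operations of composition and scaling can be combined in these
examples.  As an illustration, write $\Gamma_{{\rm O}_2}$ for the state
space of 1 mole of O$_2$ (a symbol introduced here only for this
purpose).  The compound system in (c) is then $\Gamma_c = \Gamma_a
\times \Gamma_{{\rm O}_2}^{(1/2)}$, and doubling all amounts gives
$$
\Gamma_c^{(2)} = \Gamma_a^{(2)} \times 
\bigl(\Gamma_{{\rm O}_2}^{(1/2)}\bigr)^{(2)}
= \Gamma_a^{(2)} \times \Gamma_{{\rm O}_2},
$$
i.e., 2 moles of H$_2$ and 1 mole of O$_2$ (unmixed).  Likewise,
$\Gamma_b \times \Gamma_b = \Gamma_a^{(1/2)} \times \Gamma_a^{(1/2)}$ is
a multiple scaled copy of $\Gamma_a$.  Its states are pairs $(\mfr1/2
X_1, \mfr1/2 X_2)$ with $X_1, X_2 \in \Gamma_a$, so as a state space it
is not the same as $\Gamma_a$ itself, although both describe 1 mole of
H$_2$ in total.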

\bigskip\noindent
{\subsubt 2.  The order relation}  
\medskip

The basic ingredient of thermodynamics is the relation 
$$
\prec
$$ 
of {\bf adiabatic accessibility} among  states of a system---or even
different systems.  The statement $X\prec Y$, when $X$ and $Y$ are
points in some (possibly different) state spaces, means that there is
an adiabatic transition, in the sense explained below, that takes the
point $X$ into the point $Y$.

Mathematically, we do not have to ask the meaning of \lq adiabatic\rq.
All  that matters is that a list of all possible pairs of states $X$'s
and $Y$'s such that $X \prec Y$ is regarded as given.  This list has to
satisfy certain axioms that we prescribe below in subsection C. Among
other things it must be reflexive, i.e., $X\prec X$, and transitive,
i.e., $X\prec Y$ and $Y\prec Z$ implies $X\prec Z$.  (Technically, in
standard mathematical terminology this is called a {\it pre}order
relation because we can have both $X\prec Y$ and $Y\prec X$ without
$X=Y$.) Of course, in order to have an interesting thermodynamics
result from our  $\prec$ relation it is essential that there are pairs
of points $X,Y$ for which $X\prec Y$ is {\it not} true.


Although the physical interpretation of the relation $\prec$ is
not needed for the mathematical development, for applications it
is essential to have a clear understanding of its meaning. It is
difficult to avoid some circularity when defining the concept of
adiabatic accessibility.
The following version (which is in the spirit of Planck's formulation
of the second law (Planck, 1926)) appears to be sufficiently general
and precise and appeals to us. It has the great virtue (as discovered
by Planck) that it avoids having to distinguish between work and
heat---or even having to define the concept of heat; heat, in the
intuitive sense, can always be generated by rubbing---in accordance
with Count Rumford's famous discovery while boring cannons! We
emphasize, however, that other definitions are certainly possible.  Our
physical definition is the following:  \medskip

{\bf  Adiabatic accessibility:} {\it A state $Y$ is adiabatically
accessible from a state $X$, in symbols $X\prec Y$, if it is possible
to change the state from $X$ to $Y$ by means of an interaction with
some device (which may consist of mechanical and electrical parts as
well as auxiliary thermodynamic systems) and a weight, in such a way
that the device returns to its initial state at the end of the process
whereas the weight may have changed its position in a gravitational
field.}

Let us write
$$
X\prec \prec Y \ \ \  {\rm if}\ \ \  X\prec Y \ \ \ {\rm but}\ \ \
 Y\not\prec X . \eqno(2.1)
$$
In the real world $Y$ is adiabatically accessible from $X$ only
if $X\prec \prec Y$. When $X\prec Y$ and also 
$Y\prec X$ 
then the state change can only be realized in an idealized sense, for
it will take an infinitely long time to achieve it in the manner
described.  An alternative way is to say that the \lq device\rq\ that
appears in the definition of accessibility has to return to within
\lq$\varepsilon$\rq\ of its original state (whatever that may mean) and
we take the limit $\varepsilon \to 0$. To avoid this kind of discussion
we have taken the definition as given above, but we emphasize that it
is certainly possible to redo the whole theory using only the notion of
$\prec \prec $. An emphasis on $\prec \prec $ appears in Lewis and
Randall's discussion of the second law (Lewis and Randall, 1923, page
116).
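
As a small illustration of how $\prec\prec$ interacts with $\prec$, the
following sketch uses nothing but the transitivity of $\prec$ mentioned
above.  Suppose $X\prec\prec Y$ and $Y\prec Z$.  Then $X\prec Z$, and if
$Z\prec X$ were also true we would have $Y\prec Z\prec X$, hence $Y\prec
X$, contradicting $X\prec\prec Y$.  Thus
$$
X\prec \prec Y \ \ \ {\rm and}\ \ \ Y\prec Z \ \ \ {\rm imply}\ \ \
X\prec \prec Z .
$$
Moreover, if $X$ and $Y$ are comparable, in the sense that $X\prec Y$ or
$Y\prec X$ holds, then exactly one of three alternatives occurs:
$X\prec\prec Y$, or $Y\prec\prec X$, or both $X\prec Y$ and $Y\prec X$.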


{\it Remark:} It should be noted that the operational definition above is a
definition of the concept of
`adiabatic accessibility' and not the concept of an `adiabatic
process'. A state change leading from $X$ to $Y$ can be achieved in
many different ways (usually infinitely many), and not all of them will
be `adiabatic processes' in the usual terminology. Our concern is not
the temporal development of the state change which, in real processes,
always leads out of the space of equilibrium states.
Only the end result for the system and for the rest of the world
interests us. However,  it is important to clarify the relation between
our definition of adiabatic accessibility and the usual textbook
definition of an adiabatic process. This will be discussed in Section C
after Theorem 2.1 and again in Sec. III; cf. Theorem 3.8. There it will
be shown that our definition indeed coincides with the usual notion
based on processes taking place within an `adiabatic enclosure'.
A further point to notice is that the word \lq adiabatic\rq\ is
sometimes used to mean ``slow'' or quasi-static, but nothing of the sort
is meant here. Indeed, an adiabatic process can be quite violent. The
explosion of a bomb in a closed container is an adiabatic process.




\smallskip

Here are some further examples of adiabatic  processes: \smallskip

\item{1.} Expansion or compression of a gas, with or without the help
of a weight being raised or lowered.

\item{2.} Rubbing or stirring.

\item{3.} Electrical heating. (Note that the concept of `heat' is not
needed here.)

\item{4.} Natural processes that occur within an isolated compound 
system after some barriers
have been removed. This includes  mixing and chemical or nuclear 
processes.

\item{5.} Breaking a system into pieces with a hammer and 
reassembling (Fig. 1).

\item{6.} Combinations of such changes.

In the usual parlance, rubbing would be an adiabatic process, but not
electrical `heating', because the latter requires the introduction of a
pair of wires through the `adiabatic enclosure'. For us, both processes
are adiabatic because what is required is that apart from the
change of the system itself, nothing more than
the displacement of a weight occurs. To achieve electrical heating, one
drills a hole in the container, passes a heater wire through it,
connects the wires to a generator which, in turn, is connected to a
weight. After the heating the generator is removed along with the wires,
the hole is plugged, and the system is observed to be in a new state. The
generator, etc. is in its old state and the weight is lower.


\centerline{\sevenpoint ---- (Insert Figure 1 here) ----}
%\epsfxsize 15truecm
%\epsfysize 7.5truecm 
%\epsffile{figure1.eps}


We shall use the following terminology concerning any two states $X$
and $Y$. These states are said to be {\bf comparable} (with respect to
the relation $\prec$, of course) if either $X \prec Y$ or $Y\prec X$.
If both relations hold we say that $X$ and $Y$ are {\bf adiabatically
equivalent} and write
$$
X\sima Y . \eqno(2.2)
$$
The comparison hypothesis referred to above is the statement that any
two states in the {\it same} state space are comparable.  In  the
examples of systems (a) to (f) above, all satisfy the comparison
hypothesis. Moreover, every point in $\Gamma_c$ is in the relation
$\prec$ to many (but not all) points in $\Gamma_d$. States in different
systems may or may not be comparable.  An example of non-comparable
systems is one mole of H$_2$ and one mole of O$_2$.
Another is one mole of H$_2$ and two moles of H$_2$.


One might think that if the comparison hypothesis, which will be
discussed further in Sects. II.C and II.E, were to fail for some state
space then the situation could easily be remedied by breaking up the
state space into smaller pieces inside each of which the hypothesis
holds.  This, generally, is false. What is needed to accomplish  this
is the extra requirement that {\it comparability is an equivalence
relation;} this, in turn, amounts to saying that the condition $X \prec
Z$ and $Y \prec Z$ implies that $X$ and $Y$ are comparable and,
likewise, the condition $Z \prec X$ and $Z \prec Y$ implies that $X$
and $Y$ are comparable.   (This axiom can be found in (Giles, 1964),
see axiom 2.1.2, and similar requirements were made earlier by Landsberg 
(1956),
Falk and Jung (1959) and Buchdahl (1962, 1966).) While these two 
conditions 
are
logically independent, they can be shown to be equivalent if the axiom
A3 in Section II.C is adopted. In any case, we do not adopt the
comparison hypothesis as an axiom because we find it hard to regard it
as a physical necessity. In the same vein, we do not assume that
comparability is an equivalence relation (which would then lead to the
validity of the comparison hypothesis for suitably defined
subsystems).  Our goal is to prove the comparison hypothesis starting
from axioms that we find more appealing physically.



\bigskip\noindent
{\subt B. The entropy principle}
\bigskip

Given the relation $\prec$ for all possible states of all possible
systems, we can ask whether this relation can be encoded in an entropy
function according to the following principle, which expresses the {\bf 
second 
law of thermodynamics} in a precise and quantitative way:

{\bf Entropy principle:} {\it There is a real-valued function on all
states of all systems (including compound systems), called {\bf entropy}
and denoted by $S$ such that

\item{a)} \underbar{{\tt Monotonicity:}} When $X$ and $Y$ are
comparable states then
$$
X\prec Y \hbox{ \ \ {\rm if and only if} \ \ } S(X) \leq S(Y) . 
\eqno(2.3)
$$
(See (2.6) below.)
\item{b)} \underbar{{\tt Additivity and extensivity:}} If $X$ and $Y$
are states of some (possibly different) systems and if $(X,Y)$ denotes
the corresponding state in the composition of the two systems,  then
the entropy is additive for these states, i.e.,
$$
S((X,Y)) = S(X) + S(Y) . \eqno(2.4)
$$
$S$ is also extensive, i.e.,  for each $t>0$ and each
state $X$ and its scaled copy $tX$,
$$
S(t X)=t S(X) .\eqno(2.5)
$$}

\noindent[Note: {}From now on we shall omit the double parenthesis and
write simply $S(X,Y)$ in place of $S((X,Y))$.]

A logically equivalent formulation of (2.3), that does not use
the word \lq comparable\rq\ is the following pair of statements:
$$
\eqalignno{
X\sima Y &\Longrightarrow S(X) = S(Y) \ \ \ \ {\rm and} \cr
X\prec\prec Y &\Longrightarrow S(X) < S(Y).& (2.6)\cr } 
$$
The last line is especially noteworthy. It says that entropy must
increase in an irreversible process. 


Our goal  is to construct an entropy function that satisfies the
criteria (2.3)-(2.5), and to show that it is essentially unique. We
shall proceed in stages, the first being to construct an entropy
function for a  single system, $\Gamma$,  and its multiple scaled
copies (in which comparability is assumed to hold). Having done this,
the problem of relating different systems will then arise, i.e., the
comparison question for compound systems. In the present Section II
(and {\it only} in this section) we shall simply complete the project
by {\it assuming} what we need by way of comparability. In Section IV,
the thermal axioms (the {\it zeroth law of thermodynamics}, in
particular) will be invoked to verify our assumptions about
comparability in compound systems.  In the remainder of this subsection
we discuss the significance of conditions (2.3)-(2.5).

The physical content of (2.3) was already commented on; adiabatic
processes not only increase entropy but an increase of entropy also
dictates which adiabatic processes are possible (between comparable
states, of course).


The content of additivity, (2.4), is considerably more far reaching
than one might think from the simplicity of the notation---as
we mentioned earlier. Consider four
states $X,X',Y,Y'$ and suppose that $X\prec Y$ and $X'\prec Y'$. Then
(and this will be one of our axioms) $(X,X')\prec (Y,Y')$,  and (2.4)
contains nothing new in this case. On the other hand,  the compound
system can well have an adiabatic process in which $(X,X')\prec (Y,Y')$
but $X\not\prec Y$.  In this case, (2.4) conveys much
information. Indeed, by monotonicity, there will be many cases of this
kind because the inequality $S(X) + S(X') \leq S(Y) + S(Y')$ certainly
does not imply that $S(X) \leq S(Y)$. The fact that the inequality
$S(X) + S(X') \leq S(Y) + S(Y')$ tells  us {\it exactly } which
adiabatic processes are allowed in the compound system (assuming
comparability), independent of any detailed knowledge of the manner in
which the two systems interact, is astonishing and is at the 
{\it heart of thermodynamics.}
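
{\it Example (with hypothetical entropy values, chosen only for
illustration):} Suppose that all the states involved are comparable and
that
$$
S(X)=2,\quad S(X')=1,\quad S(Y)=1,\quad S(Y')=3 .
$$
Then $S(X)+S(X')=3\leq 4=S(Y)+S(Y')$, so (2.3) and (2.4) give
$(X,X')\prec (Y,Y')$, while $S(X)=2>1=S(Y)$ gives $X\not\prec Y$. The
first system alone cannot pass adiabatically from $X$ to $Y$, but it can
do so if the second system simultaneously passes from $X'$ to $Y'$.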

Extensivity, (2.5), is {\it almost} a consequence of (2.4) alone---but 
logically it is independent.  Indeed, (2.4) implies that (2.5) holds 
for {\it rational} numbers $t$ provided one accepts the notion of
recombination as given in Axiom A5 below, i.e., one can combine
two samples of a system in the same state into a bigger system in a
state with the same intensive properties. (For systems, such as cosmic
bodies, that  do not obey this axiom, extensivity and additivity are
truly independent concepts.) On the other hand, using the
axiom of choice, one may always change a given entropy function
satisfying (2.3) and (2.4) in such a way that (2.5) is violated for
some irrational $t$, but then the function $t\mapsto S(tX)$ would end
up being unbounded in every $t$ interval.  Such pathological cases
could be excluded by supplementing (2.3) and (2.4) with the requirement
that $S(t X)$ should locally be a bounded function of $t$, either from
below or above.  This requirement, plus (2.4), would then imply (2.5).
For a discussion related to this point see (Giles, 1964), who
effectively considers {\it only} rational $t$.  See also (Hardy,
Littlewood and Polya, 1934) for a discussion of the concept of Hamel bases
which is relevant in this context.
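
{\it Sketch of the rational case:} The following computation indicates
how (2.5) for rational $t$ can be obtained from (2.3), (2.4) and A5. It
assumes that all the states that appear are comparable, so that
adiabatically equivalent states have equal entropy. For an integer
$n\geq 2$, repeated use of A5 (together with A2 and A3) gives
$$
X\sima \left({1\over n}X, {n-1\over n}X\right) \sima \cdots \sima
\bigl(\underbrace{\textstyle {1\over n}X,\dots,{1\over n}X}_{n\ {\rm
copies}}\bigr),
$$
and hence, by (2.3) and (2.4), $S(X)=n\,S({1\over n}X)$, i.e.,
$S({1\over n}X)={1\over n}S(X)$. The same argument applied to the state
${m\over n}X$, split into $m$ copies of ${1\over n}X$, gives
$$
S\left({m\over n}X\right)=m\,S\left({1\over n}X\right)={m\over n}\,S(X)
$$
for all positive integers $m$ and $n$.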

The extensivity condition can sometimes have surprising results, as in
the case of electromagnetic radiation (the `photon gas').  As is well
known (Landau and Lifschitz, 1969, Sect. 60), the phase space of such a
gas (which we imagine to reside in a box with a piston that can be used
to change the volume) is the quadrant $\Gamma=\{(U, V) \ : \
0<U<\infty,\  0<V<\infty \}$. Thus, 
$$
\Gamma^{(t)} =\Gamma
$$
as {\it sets}, which is not surprising or even exceptional. What is 
exceptional is that $S_{\Gamma}$, which gives the entropy of the states
in $\Gamma$, satisfies
$$
S_\Gamma(U,V) = \hbox{\rm (const.) } V^{1/4} U^{3/4}.
$$
It is homogeneous of first degree in the coordinates and, therefore,
the extensivity law tells us that the entropy function on 
the scaled copy $\Gamma^{(t)}$ is
$$
S_{\Gamma^{(t)}} (U, V) = t S_\Gamma (U/t, V/t)= S_\Gamma (U, V).
$$
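Indeed, writing $c$ for the constant, one checks directly that
$$
t\, S_\Gamma (U/t, V/t) = t\, c\, (V/t)^{1/4} (U/t)^{3/4}
= t\cdot t^{-1/4-3/4}\, c\, V^{1/4} U^{3/4} = S_\Gamma (U,V).
$$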
Thus, all the thermodynamic functions on the two state spaces are the
same! This unusual situation could, in principle, happen for an ordinary
material system, but we know of no  example besides the photon gas.
Here, the result can be traced to the fact that particle number is not
conserved, as it is for material systems, but it does show that one
should not jump to conclusions. There is, however, a further conceptual
point about the photon gas which is physical rather than mathematical.
If a material system had a homogeneous entropy (e.g., $ S(U,V)=
{\rm (const.) } V^{1/2} U^{1/2}$) we should still be able to distinguish
$\Gamma^{(t)}$ from $\Gamma$, even though the coordinates and entropy
were indistinguishable. This could be done by weighing the two systems
and finding out that one weighs $t$ times as much as the other.
But the photon gas is different: no experiment can tell the two apart.
However, weight {\it per se} plays no role in thermodynamics, so the 
difference between the material and photon systems is not
thermodynamically significant. 

There are two points of view one could take about this anomalous
situation. One is to continue to use the state spaces $\Gamma^{(t)}$,
even though they happen to represent identical systems.  This is not
really a problem because no one said that $\Gamma^{(t)}$ had to be
different from $\Gamma$. The only concern is to check the axioms, and in
this regard there is no problem. We could even allow the additive entropy
constant to depend on $t$, provided it satisfies the extensivity
condition (2.5). The second point of view is to say that there is only
one $\Gamma$ and no $\Gamma^{(t)}$'s at all. This would cause us to
consider the photon gas as outside our formalism and to require special
handling  from time to time. The first alternative is more attractive to
us for obvious reasons. The photon gas will be mentioned again in
connection with Theorem 2.5.



%The possibility of deriving the  entropy principle entirely from
%abstract properties of the order relation $\prec$ was apparently first
%realized by Giles (1964).  For variants of his axioms and related work
%see (Cooper, 1967), (Roberts and Luce, 1968), (Duistermaat, 1968) and
%(Hornix, 1970).  Earlier treatments of Landsberg (1956), 
%Falk and Jung (1959) and
%Buchdahl (1962, 1966) are also based on considerations of this order
%relation and some of the ideas involved appear already in the book of
%(Lewis and Randall, 1923).  Our approach differs from these works in
%several respects, the most important being the emphasis given to the
%comparison hypothesis and its derivation from other properties,
%including convexity arguments.  Also, using the concept of scaled
%copies of a system, we express the entropy directly in terms of the
%order relation by a simple formula (2.14) that does not appear in the
%other approaches. This formula involves only the system under
%consideration, and its scaled copies, and does not invoke an external,
%fixed system that serves as an \lq entropy-meter\rq\ ---as in the
%approach of Giles, for instance. The use of an entropy-meter has some
%relative advantages and disadvantages, and we do not maintain that more
%than a matter of taste is involved at this point. In fact, we shall find 
%it 
%convenient when comparing the entropies of different systems, to take 
%one standard system and use it as a reference to calibrate the entropy 
%scales. The truly important
%differences with previous work are in Sections III--VI, where we derive
%the comparison hypothesis and temperature.

\bigskip\noindent
{\subt C. Assumptions about the order relation}
\bigskip
We now list our assumptions for the order relation $\prec$. As always,
$X$, $Y$, etc. will denote states (that may belong to different
systems), and if $X$ is a state in some state space $\Gamma$, then $tX$
with $t>0$ is the corresponding state in the scaled state space
$\Gamma^{(t)}$.

\item{{\bf A1)}}  {\bf Reflexivity.} $X \sima X$.

\item{{\bf A2)}}  {\bf Transitivity.} {\it $X \prec Y$ and $Y \prec Z$
implies $X \prec Z$.}

\item{{\bf A3)}} {\bf Consistency.} {\it $X \prec X^\prime$ and $Y 
\prec Y^\prime$ implies $(X,Y) \prec
(X^\prime, Y^\prime)$.}

\item{{\bf A4)}} {\bf Scaling invariance.} {\it If $X\prec Y$, then 
$tX \prec tY$ for all $t>0$.}

\item{{\bf A5)}}  {\bf Splitting and recombination.} {\it  For $0 < 
t < 1$
$$X \sima (t X, (1-t) X). \eqno(2.7)$$}
(If $X \in \Gamma$, then the right side is in the scaled product 
$\Gamma^{(t)}\times \Gamma^{(1-t)}$, of course.)

\item{{\bf A6)}}  {\bf Stability.}  {\it If, for some pair of states, $X$ 
and 
$Y$,
$$(X, \varepsilon Z_0) \prec (Y, \varepsilon Z_1)$$
holds for a sequence of $\varepsilon$'s tending to zero and some states 
$Z_0$, $Z_1$, then 
$$X \prec Y.$$} 

{\it Remark:}  `Stability' means simply that one cannot increase the
set of accessible states with an infinitesimal grain of dust.




Besides these axioms the following property of state spaces, the 
`comparison hypothesis', plays a crucial role in our analysis in this 
section.  It will eventually be established for all state spaces after 
we have introduced some more specific axioms in later sections.

\item{{\bf CH)}} {\bf Definition:}
{\it We say the {\bf comparison hypothesis} (CH) holds for a state 
space if any two states $X$ and $Y$ in the space are comparable, i.e., 
$X\prec Y$ or $Y\prec X$.}


In the next subsection we shall show that, for every state space,
$\Gamma$, assumptions A1-A6, and CH for all two-fold scaled products,
$\Gamma^{(1-\lambda)} \times \Gamma^{(\lambda)}$, not just $\Gamma$ itself,
are in fact {\it equivalent} to the existence of an additive and
extensive entropy function that characterizes the order relation on the
states in {\it all} scaled products of $\Gamma$. Moreover, for each
$\Gamma$, this function is unique, up to an affine transformation of
scale, $S(X) \rightarrow a S(X)+B$ with $a>0$. Before we proceed to the
construction of entropy we derive a simple property of the order
relation from assumptions A1-A6, which is clearly necessary if the
relation is to be characterized by an additive entropy function.
\medskip



{\bf THEOREM 2.1 (Stability implies cancellation law).}  {\it Assume
properties A1-A6, especially A6---the stability law.  Then the {\bf
cancellation law} holds as follows.  If $X,Y$ and $Z$ are states of three
(possibly distinct) systems then
$$(X,Z) \prec (Y,Z) \ \ \ {\rm implies} \ \ \ X \prec Y \qquad {\rm
(Cancellation \ Law)}.
$$}
\medskip

{\it Proof:}  Let $\varepsilon = 1/n$ with $n = 1,2,3, \dots$.  Then we
have
$$
\eqalignii{(X,\varepsilon Z) &\sima ((1-\varepsilon) X, \varepsilon X, 
\varepsilon Z) \quad &\hbox{(by A5)} \cr
&\prec ((1-\varepsilon) X, \varepsilon Y, \varepsilon Z) \quad
&\hbox{(by
A1, A3 and A4)} \cr
&\sima ((1-2 \varepsilon) X, \varepsilon X,
\varepsilon Y,  \varepsilon Z)
\quad &\hbox{(by A5)} \cr
&\prec ((1-2 \varepsilon) X, 
2 \varepsilon Y,  \varepsilon Z) \quad
&\hbox{(by A1, A3, A4 and A5).} \cr}
$$
By doing this $n = 1/\varepsilon$ times we find that $(X, \varepsilon Z)
\prec (Y, \varepsilon Z)$. By the stability axiom A6 we then have 
$X \prec Y $. \hfill\lanbox
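
{\it Remark:} The iteration in the proof can be spelled out as an
induction; the following is a sketch in which terms with coefficient
zero are simply omitted. The claim is that for $k=0,1,\dots,n$
$$
(X,\varepsilon Z) \prec ((1-k\varepsilon) X, k\varepsilon Y,
\varepsilon Z).
$$
For $k=0$ this is A1. If it holds for some $k<n$, then
$$
\eqalign{((1-k\varepsilon) X, k\varepsilon Y, \varepsilon Z)
&\sima ((1-(k+1)\varepsilon) X, \varepsilon X, k\varepsilon Y,
\varepsilon Z) \cr
&\prec ((1-(k+1)\varepsilon) X, \varepsilon Y, k\varepsilon Y,
\varepsilon Z) \cr
&\sima ((1-(k+1)\varepsilon) X, (k+1)\varepsilon Y, \varepsilon Z).
\cr}
$$
The first and last lines use A5 (with A1 and A3). The middle line uses
the hypothesis $(X,Z)\prec (Y,Z)$, scaled by $\varepsilon$ according to
A4, together with A1 and A3. Transitivity, A2, then gives the claim for
$k+1$. For $k=n$ the claim reads $(X,\varepsilon Z)\prec
(Y,\varepsilon Z)$.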

{\it Remark:}  Under the additional assumption that $X$ and $Y$ are
comparable states (e.g., if they are in the same state space for which
CH holds), the cancellation law is logically equivalent to the following
statement (using the consistency axiom A3):
$$
{\sl If} \  X \prec\prec Y 
\ {\sl then}\  (X,Z) \prec\prec (Y,Z) \ {\sl for \  all}\ Z.
$$

The cancellation law looks innocent enough, but it is really rather
strong.  It is a partial converse of the consistency condition A3 and
it says that although the ordering in $\Gamma_1 \times \Gamma_2 $ is
{\it not} determined simply by the order in $\Gamma_1$ and $\Gamma_2$,
there are limits to how much the ordering can vary beyond the minimal
requirements of A3. It should also be noted that the cancellation law
is in accord with our physical interpretation of the order relation in
Subsection II.A.2.; a ``spectator'', namely $Z$, cannot change the
states that are adiabatically accessible from $X$.

\bigskip

{\it Remark about `Adiabatic Processes':\ \ }
With the aid of the cancellation law we can now discuss the connection
between our notion of adiabatic accessibility and the textbook concept
of an `adiabatic process'. One problem we face is that this latter
concept is hard to make precise (this was  our reason for
avoiding it in our operational definition) and therefore the discussion 
must
necessarily be somewhat informal. The general idea of an adiabatic
process, however, is that the system of interest is locked in a
thermally isolating enclosure that prevents `heat' from flowing into or
out of our system. Hence, as far as the system is concerned, all the
interaction it has with the external world during an  adiabatic process
can be thought of as being accomplished by means of some mechanical or
electrical devices. Our operational definition of the relation $\prec$
appears at first sight to be based on more general processes, since we
allow an auxiliary thermodynamical system as part of the device. We
shall now show that, despite appearances, our definition coincides with
the conventional one.

Let us temporarily denote by $\prec^*$ the relation between states based
on adiabatic processes, i.e., $X \prec^* Y$ if and only if there is a
mechanical/electrical device that starts in a state $M$ and ends up in a
state $M'$ while the system changes from $X$ to $Y$. We now assume that
the mechanical/electrical device can be restored to the initial state
$M$ from the final state $M'$ by adding or subtracting mechanical
energy, and this latter process can be reduced to the raising or
lowering of a weight in a gravitational field.  (This can be taken as a
definition of what we mean by a `mechanical/electrical device'. Note
that devices with `dissipation' do not have this property.) Thus,
$X\prec^*Y$ means there is a process in which the mechanical/electrical
device starts in some state $M$ and ends up in the same state, a weight
moves from height $h$ to height $h'$, while the state of our system
changes from $X$ to $Y$.  In symbols, 
$$ 
(X,M,h)\longrightarrow (Y,M,h').\eqno(2.8) 
$$

In our definition of adiabatic accessibility, on the other hand, we have
some {\it arbitrary} device, which interacts with our system and  which
can generate or remove heat if desired.
There is no thermal enclosure. The important constraint is that the
device starts in some state $D$ and ends up in the same state $D$.  As
before a weight moves from height $h$ to height $h'$, while our system
starts in state $X$ and ends up in state $Y$. In symbols,
$$
(X,D,h) \longrightarrow (Y,D,h') . \eqno(2.9)
$$
It is clear that (2.8) is a
special case of (2.9), so we conclude that $X\prec^*Y$ implies $X\prec
Y$. The device in (2.9) may consist of a thermal part in some state $Z$
and electrical and mechanical parts in some state $M$. Thus $D=(Z,M)$,
and (2.9) clearly implies that $(X,Z)\prec^*(Y,Z)$.

It is natural to assume that $\prec^*$ satisfies axioms A1-A6, just as
$\prec$ does.  In that case we can infer the cancellation law for
$\prec^*$, i.e.,  $(X,Z) \prec^*(Y,Z)$ implies $X \prec^* Y$.  Hence,
$X\prec Y$ (which is what (2.9) says) implies $X\prec^*Y$. Altogether we
have thus shown that $\prec$ and $\prec^*$ are really the same relation.
In words: {\it adiabatic accessibility can always be achieved by an
adiabatic process applied to the system plus a device and, furthermore,
the adiabatic process can be simplified (although this may not be easy
to do  experimentally) by eliminating all thermodynamic parts of the
device, thus making the process an adiabatic one for the system alone.}

\vfill\eject
\bigskip
\noindent
{\subt D. The construction of entropy for a single system}
\bigskip

Given a state space $\Gamma$ we may, as discussed in Subsection II.A.1.,
construct its {\it multiple scaled copies}, i.e., states of the form $$
Y=(t_1Y_1,\dots,t_NY_N)
$$
with $t_i>0$, $Y_i\in\Gamma$. It 
follows from our assumption A5 that if CH (comparison hypothesis) holds
in the state space $\Gamma^{(t_1)} \times \cdots \times \Gamma^{(t_N)}$
with $t_1,...,t_N$ fixed, then any
other state of the same form, 
$Y'=(t_1'Y_1',\dots,t_M'Y_M')$ with $Y_i'\in\Gamma$ , is comparable to 
$Y$ provided 
$\sum_i t_i=\sum_jt'_j$ 
(but not, in general, if the sums are not equal). This is proved as 
follows for $N=M=2$; the easy extension to the general case is left to
the reader. Since $t_1+t_2 = t_1'+t_2'$ we can assume,  without loss of
generality, that $t_1-t_1' = t_2'-t_2 >0$, because the case 
$t_1-t_1' =0$ is already covered by CH (which was assumed) 
for $\Gamma^{(t_1)} \times  
\Gamma^{(t_2)}$. By the splitting axiom, A5, we have 
$(t_1Y_1,t_2Y_2) \sima (t_1'Y_1, (t_1-t_1')Y_1, t_2Y_2)$ and
$(t_1'Y_1',t_2'Y_2')\sima (t_1'Y_1', (t_1-t_1')Y_2', t_2Y_2')$.
The comparability now follows from CH on the space
$\Gamma^{(t_1')} \times \Gamma^{(t_1-t_1')} \times \Gamma^{(t_2)}$.

The  entropy principle for the states in the multiple scaled
copies of a single system will now be derived. More precisely, we shall
prove the following theorem:  \medskip

{\bf THEOREM 2.2 (Equivalence of entropy and assumptions A1--A6,
CH).} {\it  Let $ \Gamma$ be a state space and let $\prec$ be a
relation on the multiple scaled copies of $\Gamma$. The following
statements are equivalent.\hfill \item{(1)} The  relation $\prec$
satisfies axioms A1--A6, and CH holds for all multiple scaled copies
of $\Gamma$.  
\item{(2)} There is a function, $S_\Gamma$ on $\Gamma$ that
characterizes the relation in the sense that if
\noindent$t_1+\cdots+t_N=t'_1+\cdots
+t_M'$, (for all $N\geq 1$ and $M\geq 1$) then
$$
(t_1Y_1,...,t_NY_N) \ \prec \ (t_1^{\prime}Y_1^{\prime},
...,t_M^{\prime}Y_M^{\prime})
$$
holds if and only if 
$$
\sum_{i=1}^N t_i S_\Gamma(Y_i) \ \leq \ \sum_{j=1}^M t_j^{\prime} 
S_\Gamma(Y_j')\ .   \eqno (2.10)
$$

The function $S_\Gamma$ is uniquely determined on $\Gamma$, up to an 
affine
transformation, i.e., any other function $S_\Gamma^*$ on $\Gamma$ 
satisfying
(2.10) is of the form $S_\Gamma^*(X)=aS_\Gamma(X)+B$ with constants $a>0$ 
and $B$.}
\medskip

{\bf Definition.} A function $S_\Gamma$ on $\Gamma$ that characterizes 
the 
relation $\prec$ on the multiple scaled copies of $\Gamma$ in the sense
stated in the theorem is called an {\bf entropy function on} $\Gamma$.
\smallskip



We shall split the proof of Theorem 2.2 into Lemmas 2.1, 2.2, 2.3 and 
Theorem 
2.3 below.


At this point it is convenient to introduce the following
notion of {\bf generalized ordering}.  While $(a_1 X_1, a_2 X_2,
\dots, a_N X_N)$ has so far only been defined when all $a_i > 0$, we can 
{\it
define} the meaning of the relation
$$
(a_1 X_1, \dots , a_N X_N) \prec (a^\prime_1 X^\prime_1, \dots ,
a^\prime_M X^\prime_M) \eqno(2.11)
$$ for arbitrary $a_i \in \R$, $a^\prime_i \in \R$, $N$ and $M$ positive 
integers
and $X_i \in \Gamma_i$, $X^\prime_i \in \Gamma^\prime_i$ as follows.
If any $a_i$ (or
$a^\prime_i$) is zero we just ignore the corresponding term. 
Example: $(0X_{1},X_{2})\prec (2X_{3},0X_{4})$ means the same thing as
$X_{2}\prec 2X_{3}$. If any $a_i$ (or
$a^\prime_i$) is negative, just move $a_i X_i$ (or $a^\prime_i 
X^\prime_i$)
to the other side and change the sign of $a_i$ (or $a^\prime_i$).  
Example: 
$$
(2 X_1, X_2) \prec (X_3, - 5 X_4, 2X_5, X_6)
$$
means that
$$
(2X_1, 5 X_4, X_2) \prec (X_3, 2 X_5, X_6)
$$
in $\Gamma_1^{(2)} \times \Gamma_4^{(5)} \times \Gamma_2$ and $\Gamma_3
\times \Gamma_5^{(2)} \times \Gamma_6$.  (Recall that $\Gamma_a \times
\Gamma_b = \Gamma_b \times \Gamma_a)$.  It is easy to check, using the
cancellation law, that {\it the splitting and recombination axiom A5
extends to nonpositive scaling parameters}, i.e., axioms A1-A6 imply 
that  $X\sima (aX,bX)$ for all $a,b\in\R$ with $a+b=1$, if the 
relation $\prec$ for nonpositive $a$ and $b$ is understood in the 
sense just described.



For the definition of the entropy function we need the following lemma,
which depends crucially on the stability assumption A6 and on the
comparison hypothesis CH for the state spaces
$\Gamma^{(1-\lambda)}\times\Gamma^{(\lambda)}$.  \medskip

{\bf LEMMA 2.1}  {\it Suppose $X_0$ and $X_1$ are two points in 
$\Gamma$ with $X_0\prec\prec X_1$. For $\lambda\in\R$ define
$$
\S_\lambda = \{ X \in \Gamma : ((1 - \lambda) X_0, \lambda X_1) 
\prec X \}.\eqno(2.12)
$$ 
Then 

(i) For every $X \in \Gamma$ there is a $\lambda \in \R$ such that $X
\in \S_\lambda$.

(ii)  For every $X \in \Gamma$, $\sup \{ \lambda : X \in \S_\lambda \}
< \infty$.  } \medskip


{\it Remark.} Since $X\sima ((1-\lambda)X,\lambda X)$ by assumption A5, 
the definition of $\S_\lambda$ really involves the order relation on 
double scaled copies of $\Gamma$ (or on $\Gamma$ itself, if 
$\lambda=0$ or 1.)


{\it Proof of Lemma 2.1.} (i)  If $X_0 \prec X$ then 
obviously $X \in \S_0$ by axiom A2.  
For general $X$ we claim
that 
$$
(1 + \alpha) X_0 \prec (\alpha X_1,X) \eqno(2.13)
$$
for some $\alpha \geq 0$ and hence $((1 -\lambda) X_0, \lambda X_1)
\prec X$ with $\lambda = - \alpha$.  The proof relies on stability, A6,
and the comparison hypothesis CH (which comes into play for the first
time):  If (2.13) were not true, then by CH we would have
$$(\alpha X_1,X) \prec (1 + \alpha) X_0$$
for all $\alpha >0$ and so, by scaling, A4, and A5
$$
\left (X_1,\, {1 \over \alpha}X\right) \prec 
\left( X_0,\, {1 \over \alpha} X_0\right).
$$
By the stability axiom A6 this would imply $X_1 \prec X_0$ in
contradiction to $X_0 \prec\prec X_1$.

(ii)  If $\sup \{ \lambda : X \in \S_\lambda \} = \infty$, then for
some sequence of $\lambda$'s tending to infinity we would have
$((1-\lambda)X_0,\lambda X_1)\prec X$ and hence  $(X_0, \lambda X_1)
\prec (X, \lambda X_0)$ by A3 and A5. By A4 this implies $\left( {1
\over \lambda} X_0, X_1 \right) \prec \left( {1 \over \lambda} X, X_0
\right)$ and hence $X_1 \prec X_0$ by stability, A6, in contradiction to
$X_0 \prec\prec X_1$.  \hfill\lanbox

We can now state our {\bf formula for the entropy function}.  If all
points in $\Gamma $ are adiabatically equivalent there is nothing to
prove (the entropy is constant), so we may assume that there are points
$X_0$, $X_1\in\Gamma$ with $X_0\prec\prec X_1$.  We then define for
$X\in\Gamma$ 
$$ 
S_\Gamma(X):=\sup\{\lambda:\ ((1-\lambda)X_0,\lambda X_1)\prec X\}. \eqno 
(2.14)
$$ 
(The symbol $a:=b$ means that $a$ is defined by $b$.) 
This $S_\Gamma$ will be referred to as the {\bf canonical
entropy} on $\Gamma$ with {\bf reference points} $X_0$ and $X_1$. 
This definition is illustrated in Figure 2.

\centerline{\sevenpoint ---- Insert Figure 2 here ----}
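
{\it Consistency check (under an explicit assumption):} Suppose that the
order relation on the scaled copies of $\Gamma$, including the
generalized ordering (2.11), is already known to be characterized in the
sense of (2.10) by some function $s$ on $\Gamma$. This is assumed here
only for the sake of illustration; establishing the existence of such a
function is the purpose of Theorem 2.2. Then $((1-\lambda)X_0,\lambda
X_1)\prec X$ holds exactly when $(1-\lambda)s(X_0)+\lambda s(X_1)\leq
s(X)$. Since $X_0\prec\prec X_1$ means $s(X_0)<s(X_1)$, this is the
condition $\lambda \leq (s(X)-s(X_0))/(s(X_1)-s(X_0))$, and the supremum
in (2.14) is
$$
S_\Gamma(X) = {s(X)-s(X_0) \over s(X_1)-s(X_0)} .
$$
Thus the canonical entropy is an affine function of $s$, normalized so
that $S_\Gamma(X_0)=0$ and $S_\Gamma(X_1)=1$.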

By Lemma 2.1 $S_\Gamma(X)$ is well defined and  $S_\Gamma(X)<\infty$ for
all $X$.  (Note that by stability we could replace $\prec$ by 
$\prec\prec$
in (2.14).) We shall now show that this $S_\Gamma$ has all the right
properties. The first step is the following simple lemma, which does not
depend on the comparison hypothesis.  \medskip

{\bf LEMMA 2.2 ($\prec$ is equivalent to $\leq$).}  
{\it Suppose $X_0
\prec\prec X_1$ are states and $a_0, a_1, a^\prime_0, a^\prime_1$
are real numbers
with $a_0 + a_1 = a^\prime_0 + a^\prime_1$.  Then the following are
equivalent.  \item{(i)}  $(a_0 X_0, a_1 X_1) \prec (a^\prime_0 X_0,
a^\prime_1 X_1)$ \item{(ii)}  $a_1 \leq a^\prime_1$ (and hence $a_0
\geq a^\prime_0$).  \smallskip\noindent In particular, $\sima$ holds in
(i) if and only if $a_1 = a^\prime_1$ and $a_0 = a^\prime_0$.}
\smallskip

{\it Proof:} We give the proof assuming that the numbers $a_0, a_1,
a^\prime_0, a^\prime_1$ are all positive and $a_0 + a_1 = a^\prime_0
+ a^\prime_1=1$. The other cases are similar. We write $a_1=\lambda$
and $a_1'=\lambda'$.

(i) $\Rightarrow$ (ii).  If $\lambda > \lambda^\prime$ then, by A5 and
A3, $((1 - \lambda) X_0, \lambda^\prime  X_1, (\lambda -
\lambda^\prime) X_1) \prec ((1 - \lambda) X_0, (\lambda
-\lambda^\prime) X_0, \lambda^\prime X_1)$.  By the cancellation law,
Theorem 2.1, $((\lambda - \lambda^\prime) X_1) \prec ((\lambda -
\lambda^\prime) X_0)$.  By scaling invariance, A4,  $X_1 \prec X_0$,
which contradicts $X_0 \prec\prec X_1$.  \hfill\break (ii)
$\Rightarrow$ (i).  This follows from the following computation.
$$
\eqalignii {((1-\lambda)X_0, \lambda X_1) &\sima
((1-\lambda^\prime)X_0, (\lambda^\prime - \lambda)X_0, \lambda X_1)
\quad &\hbox{(by axioms A3 and A5)} \cr &\prec ((1-\lambda^\prime)X_0,
(\lambda^\prime - \lambda)X_1, \lambda X_1) \quad &\hbox{(by axioms A3
and A4)} \cr &\sima ((1-\lambda^\prime)X_0, \lambda^\prime X_1)  \quad
&\hbox{(by axioms A3 and A5).} \cr}
$$ 
\hfill\lanbox

The next lemma will imply, among other things, that entropy is unique,
up to an affine transformation. \medskip

{\bf LEMMA 2.3 (Characterization of entropy).}  {\it Let $S_\Gamma$ 
denote the canonical entropy (2.14) on $\Gamma$ with respect to the 
reference points 
$X_0\prec\prec X_1$. If $X \in \Gamma$
then the equality
$$
\lambda = S_\Gamma (X)
$$
is equivalent to
$$
X \sima ((1 - \lambda) X_0, \lambda X_1).
$$}
\smallskip

 {\it Proof:}  First, if $\lambda = S_\Gamma(X)$ then, by the
definition of supremum, there is a sequence $\varepsilon_1 \geq
\varepsilon_2 \geq \dots \geq 0$ converging to zero, such that
$$
((1 - (\lambda - \varepsilon_n)) X_0, (\lambda - \varepsilon_n) X_1)
\prec X$$
for each $n$.  Hence, by A5,
$$((1 - \lambda) X_0, \lambda X_1, \varepsilon_n X_0) \sima ((1 -
\lambda + \varepsilon_n) X_0, (\lambda - \varepsilon_n) X_1,
\varepsilon_n X_1) \prec (X, \varepsilon_n X_1),$$ and thus $((1 -
\lambda) X_0, \lambda X_1) \prec X$ by the stability
property A6.  On the other hand, since $\lambda$ is the supremum we have
$$X \prec ((1 - (\lambda + \varepsilon)) X_0, (\lambda + \varepsilon)
X_1)
$$
for all $\varepsilon > 0$ by the comparison hypothesis CH.  Thus,
$$
(X, \varepsilon X_0) \prec ((1 - \lambda) X_0, \lambda X_1,
\varepsilon X_1),
$$
so, by A6,  $X \prec ((1 - \lambda) X_0, \lambda X_1)$.  This shows that 
$X 
\sima
((1 - \lambda) X_0, \lambda X_1)$ when $\lambda = S_\Gamma(X)$.

Conversely, if $\lambda^\prime \in [0,1]$ is such that $X \sima ((1 -
\lambda^\prime) X_0, \lambda^\prime X_1)$, then $((1 - \lambda^\prime)
X_0, \lambda^\prime X_1) \sima ((1 - \lambda) X_0, \lambda X_1)$ by
transitivity.  Thus, $\lambda = \lambda^\prime$ by Lemma 2.2.
\hfill\lanbox
\smallskip
%%%%%%%%
{\it Remark 1:}  Without the comparison hypothesis we could find that
$S_\Gamma(X_0)= 0$ and $S_\Gamma(X) = 1$ for all $X$ such that $X_0 \prec 
X$.
\smallskip

{\it Remark 2:} {}From Lemma 2.3 and the cancellation law it
follows that the canonical entropy with reference points $X_0\prec\prec
X_1$ satisfies $0\leq S_\Gamma (X)\leq 1$ if and only if $X$ belongs to 
the
{\bf strip} $\Sigma (X_0, X_1)$ defined by
$$
\Sigma (X_0, X_1) := \{ X \in \Gamma : X_0 \prec X \prec X_1 \} \subset
\Gamma.$$
Let us make the dependence of the canonical entropy on $X_0$ 
and $X_1$ explicit by writing 
$$
S_\Gamma(X)=S_\Gamma(X\vert X_0,X_1) \ . \eqno(2.15)
$$ 
For
$X$ outside the strip we can then write 
$$
S_\Gamma (X \vert X_0, X_1)= S_\Gamma (X_1 \vert X_0, X)^{-1}
\qquad\hbox{if\ }X_1\prec X$$
and
$$
S_\Gamma (X \vert X_0, X_1)= -{S_\Gamma ( X_0\vert X, X_1)\over 
1-S_\Gamma 
(X_0 \vert X, X_1)}
\qquad\hbox{if\ }X\prec X_0.
$$
\smallskip
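{\it Remark (consistency check):} The two formulas for $X$ outside the
strip can be checked by a short computation. What follows is only a
sketch: it anticipates the affine uniqueness of entropy (Theorem 2.3
below) and assumes that $S_\Gamma(\,\cdot\,\vert X_0,X_1)$ is additive
and extensive on the scaled copies involved. The symbols $\mu$ and $\nu$
are introduced here for illustration. If $X\prec\prec X_0$, put
$\mu=S_\Gamma(X_0\vert X,X_1)$, so that $X_0\sima((1-\mu)X,\mu X_1)$ by
Lemma 2.3. Since $S_\Gamma(X_0\vert X_0,X_1)=0$ and
$S_\Gamma(X_1\vert X_0,X_1)=1$, this gives
$$
0=(1-\mu)\,S_\Gamma(X\vert X_0,X_1)+\mu ,
\qquad\hbox{i.e.,}\qquad
S_\Gamma(X\vert X_0,X_1)=-{\mu\over 1-\mu}.
$$
Here $\mu\neq 1$, because $\mu=1$ would mean $X_0\sima X_1$. In the
same way, if $X_1\prec\prec X$ and $\nu=S_\Gamma(X_1\vert X_0,X)$, then
$X_1\sima((1-\nu)X_0,\nu X)$ and hence
$$
1=\nu\,S_\Gamma(X\vert X_0,X_1),
\qquad\hbox{i.e.,}\qquad
S_\Gamma(X\vert X_0,X_1)=\nu^{-1}.
$$
\smallskip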
%\vfill\eject
{\tt Proof of Theorem 2.2:}

{\it (1) $\Longrightarrow$ (2):}   Put $\lambda_i=S_\Gamma(Y_i)$,
$\lambda_i'=S_\Gamma(Y_i')$. By Lemma 2.3  we know that $Y_i\sima
((1-\lambda_i)X_0,\lambda_i X_1)$ and
$Y_i'\sima ((1-\lambda_i')X_0,\lambda_i' X_1)$. By the consistency 
axiom A3   and the recombination axiom A5  it follows that
$$
(t_1Y_1,\dots,t_NY_N)\sima (\sum_i t_i(1-\lambda_i)X_0, \sum_i 
t_i\lambda_i X_1)
$$
and 
$$
(t_1'Y_1',\dots,t_N'Y_N')\sima (\sum_i t_i'(1-\lambda_i')X_0, \sum_i 
t_i'\lambda_i' X_1) \ .
$$
Statement (2) now follows from Lemma 2.2. 
The implication (2) $\Longrightarrow$ (1) is obvious.


The  proof of Theorem 2.2 is now complete except for the uniqueness part.
We formulate this part separately in Theorem 2.3 below, which is slightly 
stronger than
the last assertion in Theorem 2.2.
It implies that an entropy function for the multiple scaled copies of 
$\Gamma$ is 
already uniquely 
determined, up to an affine transformation, by the relation on states of 
the form $((1-\lambda)X,\lambda Y)$, i.e., it requires only the case
$N=M=2$, in the notation of Theorem 2.2.
\medskip

{\bf THEOREM 2.3 (Uniqueness of entropy).} {\it If $S_\Gamma^*$ is a 
function on $\Gamma$ that satisfies
$$
((1-\lambda)X,\lambda Y)\prec((1-\lambda)X',\lambda Y')
$$
if and only if
$$
(1-\lambda)S_\Gamma^*(X)+\lambda 
S_\Gamma^*(Y)\leq(1-\lambda)S_\Gamma^*(X')+\lambda 
S_\Gamma^*(Y'),
$$
for all $\lambda\in[0,1]$ and $X,Y,X',Y'\in\Gamma$, then 
$$
S_\Gamma^*(X)=aS_\Gamma(X)+B
$$ 
with 
$$
a=S_\Gamma^*(X_1)-S_\Gamma^*(X_0)>0,\qquad B=S_\Gamma^*(X_0).
$$
Here $S_\Gamma$ is the canonical entropy on $\Gamma$ 
with reference points $X_0\prec\prec X_1$.}
\medskip

{\it Proof:} This follows immediately from Lemma 2.3, which says that 
for every $X$ there is a unique $\lambda$, namely 
$\lambda=S_\Gamma(X)$, such that 
$$X\sima 
((1-\lambda)X,\lambda X)\sima ((1-\lambda)X_0,\lambda X_1).$$
Hence, by the hypothesis on $S_\Gamma^*$, and $\lambda=S_\Gamma(X)$, we 
have
$$
S_\Gamma^*(X)=(1-\lambda)S_\Gamma^*(X_0)+
\lambda S_\Gamma^*(X_1) = [S_\Gamma^*(X_1)-S_\Gamma^*(X_0)]S_\Gamma(X)
+S_\Gamma^*(X_0).
$$
The hypothesis on $S_\Gamma^*$ also implies that $a:= 
S_\Gamma^*(X_1)-S_\Gamma^*(X_0) >0$, 
because $X_0\prec\prec X_1$.\hfill\lanbox
\medskip

{\it Remark:} Note that $S_\Gamma^*$ is defined on $\Gamma$ and satisfies
$S_\Gamma^*(X) = aS_\Gamma(X)+B$ there. On the space $\Gamma^{(t)}$ 
a corresponding entropy is, {\it by definition}, given by
$S_{\Gamma^{(t)}}^*(tX) = tS_\Gamma^*(X)= atS_\Gamma(X) + tB =
aS_\Gamma^{(t)}(tX) + tB$, where $S_\Gamma^{(t)}(tX)$ is the canonical
entropy on $\Gamma^{(t)}$ with reference points $tX_0, tX_1$.  Thus,
$S_{\Gamma^{(t)}}^*(tX)\neq  aS_\Gamma^{(t)}(tX) +B$ \ (unless $B=0$,
of course).  \bigskip

It is apparent from formula (2.14) that the definition of the canonical
entropy function on $\Gamma$ involves only the relation $\prec$ on the
double scaled products $\Gamma^{(1-\lambda)}\times \Gamma^{(\lambda)}$
besides the reference points $X_0$ and $X_1$. Moreover, the canonical
entropy uniquely characterizes the relation on all multiple scaled 
copies
of $\Gamma$, which implies in particular that CH holds for all multiple
scaled copies. Theorem 2.3 may therefore be rephrased as follows: 
\medskip

{\bf THEOREM 2.4 (The relation on double scaled copies determines the
relation everywhere).} {\it Let $\prec$ and $\prec^*$ be two relations on
the multiple scaled copies of $\Gamma$ satisfying axioms A1-A6, and also
CH for $\Gamma^{(1-\lambda)}\times \Gamma^{(\lambda)}$ for each fixed
$\lambda\in[0,1]$.  If $\prec$ and $\prec^*$ coincide on
$\Gamma^{(1-\lambda)}\times \Gamma^{(\lambda)}$ for each 
$\lambda\in[0,1]$,
then $\prec$ and $\prec^*$ coincide on all multiple scaled copies of
$\Gamma$, and CH holds on all the multiple scaled copies.}
\medskip
 The proof of Theorem 2.2 is now complete.

\bigskip\noindent
{\subt E.  Construction of a universal entropy in the absence of mixing}
\bigskip

In the previous subsection we showed how to construct an entropy 
for a single system, $\Gamma$, that exactly describes the 
relation $\prec$ within the states obtained by forming multiple
scaled copies of $\Gamma$. It is unique up to a multiplicative
constant $a>0$ and an additive constant $B$, i.e., to within an
affine transformation. We remind the reader that this entropy was
constructed by considering just the product of two scaled copies of 
$\Gamma$, but 
our axioms implied that it automatically worked for {\it all} 
multiple scaled copies of $\Gamma$. We shall refer to $a$ and $B$ as {\bf 
entropy constants} for the system $\Gamma$. 

Our goal is to put these entropies together
and show that they behave in the right way on products of
arbitrarily many copies of {\it different} systems. Moreover,
this \lq universal\rq\ entropy will be unique up to {\it one}
multiplicative constant---but still many additive constants.
The central question here is one of {\it \lq calibration\rq\ },
which is to  say that the multiplicative constant in front of 
each elementary entropy has to be chosen in such a way that
the additivity rule (2.4) holds. It is not even obvious yet
that the additivity can be made to hold at all, whatever the choice
of constants.

Let us note that the number of additive constants depends heavily on the
kinds of adiabatic processes available. The system consisting of one
mole of hydrogen mixed with one mole of helium and the system
consisting of one mole of hydrogen mixed with two moles of helium are
different.  The additive constants are independent {\it unless} a
process exists in which both systems can be unmixed, thereby making
the constants comparable. In nature we expect only 92 constants, one for
each element of the periodic table, unless we allow nuclear processes as
well, in which case there are only two constants (for neutrons and for
hydrogen).  On the other hand, if un-mixing is not allowed uncountably
many constants are undetermined. In Section VI we address the question
of adiabatic processes that unmix mixtures and reverse chemical
reactions. That such  processes exist is not so obvious.

To be precise, the principal goal  of this subsection is the proof of
the following Theorem 2.5, which is a  case of the entropy principle
that is special in that it is restricted to processes that do not
involve  mixing or chemical reactions. It is a generalization of Theorem
2.2. 
\medskip


{\bf THEOREM 2.5 (Consistent entropy scales). } {\it Consider a family of 
systems fulfilling the following requirements:

\item{(i)} 
The state spaces of any two systems in the family are disjoint sets, 
i.e., 
every 
state of a system in the family belongs to exactly one state space.

\item{(ii)} All multiple scaled products of systems in the family
belong also to the family.  

\item{(iii)} Every system in the family satisfies the 
comparison hypothesis. 

For each state space $\Gamma$ of a system in the family let
$S_{\Gamma}$ be some definite entropy function on $\Gamma$. Then 
there are constants $a_{\Gamma}$ and $B_{\Gamma}$ such that the 
function $S$, defined for all states in all $\Gamma$'s by
$$
S(X)= a_{\Gamma} S_{\Gamma} (X)+ B_{\Gamma}
$$
for $X\in \Gamma$, has the following properties:
\smallskip
\item{a).} If $X$ and $Y$ are  in the same state space then
$$
X\prec Y \quad\quad \hbox{\rm if and only if} \quad\quad S(X)\leq 
S(Y).
$$ \smallskip

\item{b).} $S$ is additive and extensive, i.e.,
$$
S(X,Y) = S(X)+S(Y). \eqno (2.4)
$$ 
and, for $t>0$,
$$
S(tX) = tS(X). \eqno (2.5)
$$  }
\medskip
{\it Remark.\/} Note that $\Gamma_1$ and $\Gamma_1\times \Gamma_2$ are 
disjoint 
as sets for any (nonempty) state spaces $\Gamma_1$ and $\Gamma_2$.
\medskip

{\it Proof:} Fix some system $\Gamma_0$ and two points $Z_0\prec 
\prec Z_1$
in  $\Gamma_0$.  In each state space $\Gamma$ choose some fixed point 
$X_{\Gamma} \in \Gamma$ in such a way that the identities
$$\eqalignno{
X_{\Gamma_1  \times \Gamma_2}&= (X_{\Gamma_1}, X_{\Gamma_2}) 
&(2.16)\cr  \noalign{\smallskip} 
X_{t\Gamma}                 &= tX_{\Gamma}&(2.17)\cr }
$$
hold.  With the aid of the axiom of choice this can  be achieved by
considering the formal vector space spanned by all systems and choosing a
Hamel basis of systems $\{\Gamma_{\alpha}\}$ in this space such that 
every
system can be written uniquely as a scaled product of a finite number of
the $\Gamma_{\alpha}$'s. (See Hardy, Littlewood and Polya, 1934). The
choice of an arbitrary state $X_{\Gamma_{\alpha}}$ in each of these
`elementary' systems $\Gamma_{\alpha}$ then defines for each $\Gamma$ a
unique $X_{\Gamma}$ such that (2.17) holds.  (If the reader does not wish
to invoke the axiom of choice then an alternative is to hypothesize that
every system has a unique decomposition into elementary systems; the 
simple
systems considered in the next section obviously qualify as the 
elementary
systems.) 
 
For $X\in \Gamma$ we consider the space $\Gamma \times \Gamma_0$ with 
its
canonical entropy as defined in (2.14), (2.15) relative to the points
$(X_{\Gamma}, Z_0)$ and $(X_{\Gamma}, Z_1)$. Using this function we 
define
$$
S(X)= S_{\Gamma \times \Gamma_0}((X,Z_0) \, \, \vert \, \, (X_{\Gamma}
, Z_0),(X_{\Gamma} , Z_1)). \eqno(2.18)
$$

Note: Equation (2.18) fixes the entropy of $X_{\Gamma}$ to be zero.

Let us denote $S(X) $ by $\lambda$ which, by Lemma 2.3, is 
characterized by
$$
(X,Z_0) \sima ( (1-\lambda ) (X_{\Gamma} , Z_0) , \lambda  
(X_{\Gamma} , Z_1)).
$$
By the cancellation law this is equivalent to
$$
(X,\lambda Z_0)\sima  (X_{\Gamma}, \lambda Z_1).  \eqno(2.19)
$$

By (2.16) and (2.17) this immediately implies the additivity and
extensivity of $S$.  Moreover, since $X\prec Y$ holds if and only if $(X,
Z_0) \prec (Y,Z_0) $ it is also clear that $S$ is an entropy function on
any $\Gamma$.  Hence $S$ and $S_{\Gamma}$ are related by an affine
transformation, according to Theorem 2.3.  \hfill \lanbox
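
\smallskip
{\it Remark:} The step from (2.19) to additivity and extensivity can be
spelled out as follows. This is a sketch for the case that the numbers
$\lambda=S(X)$ and $\mu=S(Y)$ are nonnegative; the remaining cases are
assumed to be handled in the same way after moving the scaled copies of
$Z_0$ and $Z_1$ to the other side of the relation. Let $X\in\Gamma_1$
and $Y\in\Gamma_2$. By (2.19), $(X,\lambda Z_0)\sima
(X_{\Gamma_1},\lambda Z_1)$ and $(Y,\mu Z_0)\sima (X_{\Gamma_2},\mu
Z_1)$. Hence, by A3, A5 and (2.16),
$$
((X,Y),(\lambda+\mu)Z_0)\sima (X,\lambda Z_0,Y,\mu Z_0)\sima
(X_{\Gamma_1},\lambda Z_1,X_{\Gamma_2},\mu Z_1)\sima
(X_{\Gamma_1\times\Gamma_2},(\lambda+\mu)Z_1).
$$
This is (2.19) for the state $(X,Y)\in\Gamma_1\times\Gamma_2$ with
$\lambda+\mu$ in place of $\lambda$, so $S(X,Y)=\lambda+\mu=S(X)+S(Y)$ by
the uniqueness statement in Lemma 2.3. Likewise, for $t>0$, applying
A4 to (2.19) and using (2.17) gives
$$
(tX,t\lambda Z_0)\sima (tX_{\Gamma},t\lambda Z_1)=(X_{t\Gamma},t\lambda
Z_1),
$$
so that $S(tX)=t\lambda=tS(X)$.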

\medskip


{\bf Definition (Consistent entropies).} A collection of entropy 
functions $S_\Gamma$ on state spaces $\Gamma$ is called {\it 
consistent} if the appropriate linear combination of the functions is 
an entropy function on all multiple scaled products of these state 
spaces.  In other words, the set is consistent if the multiplicative 
constants $a_{\Gamma}$, referred to in Theorem 2.5, can all be chosen 
equal to 1.  \smallskip


\underbar{{\it Important Remark:}}
{}From the definition, (2.14), of the canonical entropy
and (2.19) it follows that  the entropy (2.18) is given by the formula
$$
S(X) = \sup \{ \lambda \, \, : \, \, (X_{\Gamma}, \lambda Z_1)
\prec (X , \lambda Z_0) \}   \eqno (2.20)
$$
for $X\in\Gamma$.  The auxiliary system $\Gamma_0$ can thus be 
regarded as an `entropy meter' in the spirit of (Lewis and Randall, 
1923) and (Giles, 1964).  Since we have chosen to define the entropy 
for each system independently, by equation (2.14), the role of $\, 
\Gamma_0$ in our approach is solely to calibrate the entropy of 
different systems in order to make them consistent.

\medskip

{\it Remark about the photon gas:\/} As we discussed in Section II.B
the photon gas is special and there are two ways to view it.  One way
is to regard the scaled copies $\Gamma^{(t)}$ as distinct systems and
the other is to say that there is only one $\Gamma$ and the scaled
copies are identical to it and, in particular, must have exactly the
same entropy function.  We shall now see how the first point of view
can be reconciled with the latter requirement.  Note, first, that in
our construction above we cannot take the point $(U,V)=(0,0)$ to be the
fiducial point $X_{\Gamma}$ because $(0,0)$ is not in our state space
which, according to the discussion in Section III below, has to be an
open set and hence cannot contain any of its boundary points such as
$(0,0)$. Therefore, we have to make another choice, so let us take
$X_{\Gamma}= (1,1)$. But the construction in the proof above sets
$S_{\Gamma} (1,1)= 0$ and therefore $S_{\Gamma}(U,V) $ will not have
the homogeneous form $S^{\rm hom}(U,V)= V^{1/4}U^{3/4}$.  Nevertheless,
the entropies of the scaled copies will be extensive, as required by
the theorem.  If one feels that all scaled copies should have the same
entropy (because they represent the same physical system) then the
situation can be remedied in the following way: With $S_{\Gamma}(U,V) $
being the entropy constructed as in the proof using $(1,1)$, we note
that $S_{\Gamma}(U,V) = S^{\rm hom}(U,V) + B_{\Gamma}$ with the
constant $B_{\Gamma}$  given by $B_{\Gamma}= -S_{\Gamma}(2,2)$. This
follows from simple algebra and the fact that we know that the entropy
of the photon gas constructed in our proof must equal $S^{\rm hom}$ to
within an additive constant. (The reader might ask how we know this and
the answer is that the entropy of the `gas' is unique up to additive
and multiplicative constants, the latter being determined by the system
of units employed.  Thus, the entropy determined by our construction
must be the `correct entropy', up to an additive constant, and this
`correct entropy' is what it is, as determined by physical measurement.
Hopefully it agrees with the function deduced in (Landau and Lifschitz,
1969).) Let us use our freedom to alter the additive constants as we
please, provided we maintain the extensivity condition (2.5).  It will
not be until Section VI that we have to worry about the additive
constants {\it per se} because it is only there that mixing and
chemical reactions are treated.  Therefore, we redefine the entropy of
the state space $\Gamma$ of the photon gas to be $S^*(U,V) :=
S_{\Gamma}(U,V) + S_{\Gamma}(2,2)$, which is the same as $S^{\rm
hom}(U,V)$. We also have to alter the entropy of the scaled copies
according to the rule that preserves extensivity, namely
$S_{\Gamma^{(t)}}(U,V) \rightarrow S_{\Gamma^{(t)}}(U,V)
+tS_{\Gamma}(2,2) =S_{\Gamma^{(t)}}(U,V) + S_{\Gamma^{(t)}}(2t,2t) =
S^{\rm hom}(U,V)$.  In this way, all the scaled copies now have the
same (homogeneous) entropy, but we remind the reader that the same
construction could be carried out for any material system with a
homogeneous (or, more exactly an affine) entropy function---if one
existed. {}From the thermodynamic viewpoint, the photon gas is unusual
but not special. \bigskip \bigskip
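
{\it Numerical check:} The constants in the preceding remark can be
verified directly. The computation below is a sketch that takes the
relation $S_{\Gamma}(U,V)=S^{\rm hom}(U,V)+B_{\Gamma}$ with $S^{\rm
hom}(U,V)=V^{1/4}U^{3/4}$ at face value, together with the
normalization $S_{\Gamma}(1,1)=0$:
$$\eqalign{
B_{\Gamma} &= S_{\Gamma}(1,1)-S^{\rm hom}(1,1)=0-1=-1,\cr
S_{\Gamma}(2,2) &= S^{\rm hom}(2,2)+B_{\Gamma}=2^{1/4}\,2^{3/4}-1=1=-B_{\Gamma},\cr
S_{\Gamma^{(t)}}(U,V) &= t\,S_{\Gamma}(U/t,V/t)=S^{\rm hom}(U,V)-t.\cr}
$$
The last line uses extensivity (2.5) and the fact that $S^{\rm hom}$ is
homogeneous of degree one. Adding $tS_{\Gamma}(2,2)=t$ therefore
restores $S^{\rm hom}(U,V)$ on every scaled copy, in agreement with the
rule stated above.
\medskip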

\bigskip\noindent
{\subt F. Concavity of entropy}
\bigskip                                                        

Up to now we have not used, or assumed, any geometric property of a
state space $\Gamma$.  It is an important stability  property of
thermodynamical systems, however, that the entropy function is a {\it
concave} function of the state variables ---a requirement that was
emphasized by Maxwell, Gibbs, Callen and many others.  Concavity also
plays an important role in the definition of temperature, as in section
V.

In order to have this concavity it is first necessary to make the
state space on which entropy is defined into a convex set, and for
this purpose the choice of coordinates is important. Here, we begin
the discussion of concavity by discussing this geometric property
of the underlying state space and some of the consequences of the
{\it convex combination axiom} A7 for the relation $\prec$, to be given
after the following definition.


{\bf Definition:} By a {\bf state space with  a 
convex structure}, or simply a {\bf convex state space}, we 
mean a
state space $\Gamma$, that is a convex subset of some 
linear space, e.g., $\R^n$. That is, if $X$ and $Y$ are any two 
points in $\Gamma$ and if $0 \leq t \leq 1$,  then
the point $tX + (1-t)Y$ is a well-defined point in $\Gamma$. 
A {\it concave function}, $S$,  on $\Gamma$ is one satisfying the
inequality
$$
S(tX + (1-t)Y) \geq tS(X) + (1-t)S(Y).  \eqno(2.21)
$$


Our basic convex combination axiom for the relation $\prec$ is the
following.
\medskip


\item{\bf A7)}  {\bf Convex combination.} 
Assume $X$ and $Y$ are states in the same {\it convex} state space,
$\Gamma$. For $t \in [0,1]$ let $tX$ and  
$(1-t)Y$ be the corresponding states of their $t$ scaled and
$(1-t)$ scaled copies, respectively. Then the point $(t X, (1-t)
Y)$ in the product space $\Gamma^{(t)}\times
\Gamma^{(1-t)}$ satisfies
$$ 
(t X, (1-t) Y) \prec t X + (1-t)Y\ . \eqno(2.22) 
$$
Note that the right side of (2.22) is in $\Gamma$ and is defined 
by ordinary
convex combination of points in the convex set $\Gamma$.
\medskip

The physical meaning of A7 is more or less evident, but it is essential
to note that the 
convex structure depends heavily on the choice of coordinates for
$\Gamma$.  A7 means that if we take a bottle containing $1/4$ moles of
nitrogen and one containing $3/4$ moles (with possibly different
pressures and densities), and if we mix them together, then among the
states of one mole of nitrogen that can be reached adiabatically there
is one in which the energy is the sum of the two energies and, likewise,
the volume is the sum of the two volumes. Again, we emphasize that the
choice of energy and volume as the (mechanical) variables with which we
can make this statement is an important assumption. If, for example,
temperature and pressure were used instead, the statement would not only
not hold, it would not even make much sense.


The physical example above seems unobjectionable for liquids and 
gases.  On the other hand it is not entirely clear how to ascribe an 
operational meaning to a convex combination in the state space of a 
solid, and the physical meaning of axiom A7 is not as obvious in this 
case.  Note, however, that although convexity is a global property, it 
can often be inferred from a local property of the boundary.  (A 
connected set with a smooth boundary, for instance, is convex if every 
point on the boundary has a neighbourhood, whose intersection with the 
set is convex.)  In such cases it suffices to consider convex 
combinations of points that are close together and close to the 
boundary.  For small deformation of an isotropic solid the six strain 
coordinates, multiplied by the volume, can be taken as work 
coordinates.  Thus, A7 amounts to assuming that a convex combination 
of these coordinates can always be achieved adiabatically.  See, e.g., 
(Callen, 1985).


If $X \in \Gamma$ we denote by $A_X$ the set $\{ Y \in \Gamma : X
\prec Y \}$. $A_X$ is called the {\bf forward sector } of $X$ in
$\Gamma$.  More generally, if $\Gamma^\prime $ is another system,
we call the set
$$
\{Y\in \Gamma': X\prec Y\},
$$
the forward sector of $X$ in $\Gamma^\prime $.  

Usually this concept is applied to the case in which  $\Gamma$ and
$\Gamma^\prime $ are identical, but it can also be useful in cases in
which one system is changed into another; an example is the mixing of
two liquids in two containers (in which case $\Gamma $ is a compound
system) into  a third vessel containing the mixture (in which case
$\Gamma^\prime $ is simple).  


The main effect of A7 is that forward sectors are convex sets. 
\medskip


{\bf THEOREM 2.6} {\bf (Forward sectors are convex).} {\it Let
$\Gamma$ and $\Gamma'$ be state spaces of two systems, with
$\Gamma'$ a convex state space. Assume that A1--A5 hold for
$\Gamma$ and $\Gamma'$ and, in addition, A7 holds for
$\Gamma'$. Then the forward sector of $X$ in $\Gamma'$, defined
above, is a {\it convex\/} subset of $\Gamma'$ for each $X\in
\Gamma$.}
\smallskip

{\it Proof:\/} Suppose $X\prec Y_1$ and $X\prec Y_2$ and that $0<t <1$.
We want to show that $X\prec t Y_1+(1-t)Y_2$. (The right side defines,
by ordinary vector addition, a point in the convex set $\Gamma'$.  )
First, $X\prec (t X,(1-t)X)\in \Gamma^{(t)}\times \Gamma^{(1-t)}$, by
axiom A5. Next, $(t X,(1-t)X)\prec (t Y_1,(1-t)Y_2)$ by the
consistency axiom A3 and the scaling invariance axiom
A4. Finally, $(t Y_1,(1-t)Y_2)\prec t Y_1+(1-t)Y_2$ by the convex
combination axiom A7.\nobreak\hfill\lanbox
\medskip

Figure 3 illustrates this theorem in the case $\Gamma = \Gamma'$. 

\centerline{\sevenpoint ---- Insert Figure 3 here ----}

{\bf THEOREM 2.7 (Convexity of $\S_{\lambda}$).}  {\it Let the sets 
$\S_\lambda 
\subset \Gamma$ 
be defined as in (2.12) and assume the state space $\Gamma$ satisfies the 
convex combination axiom A7 
in addition to A1-A5. Then:

(i) $\S_\lambda$ is convex. \hfill 

(ii) If $X\in \S_{\lambda_1}$, $Y\in \S_{\lambda_2}$ and 
$0\leq t\leq 1$,  then $tX+(1-t)Y\in {\cal
S}_{t\lambda_1+(1-t)\lambda_2}$. \hfill 
}
\medskip

{\it Proof.} (i) This follows immediately from the scaling, splitting and
convex combination axioms A4, A5 and A7.


(ii) This is proved by splitting, moving the states of the subsystems 
into 
forward sectors and bringing the subsystems together at the end.  More 
precisely, defining $\lambda=t\lambda_1+(1-t)\lambda_2$ we have to show 
that $((1 - \lambda) X_0, \lambda X_1) \prec tX + (1 - t)Y$.  Starting 
with 
$((1-\lambda)X_0,\lambda X_1)$ we split $(1-\lambda)X_0$ into 
$(t(1-\lambda_1)X_0, (1-t)(1-\lambda_2)X_0)$ and $\lambda X_1$ into 
$(t\lambda_1X_1,(1-t)\lambda_2X_1)$.  Next we consider the states 
$(t(1-\lambda_1)X_0,t\lambda_1X_1)$ and 
$((1-t)(1-\lambda_2)X_0,(1-t)\lambda_2X_1)$.  By scaling
invariance A4 and the splitting property A5 we 
can pass from the former to $(t(1-\lambda_1)X,t\lambda_1X)$ and from the 
latter to $((1-t)(1-\lambda_2)Y,(1-t)\lambda_2Y)$.  Now we combine the 
parts 
of $(t(1-\lambda_1)X,t\lambda_1X)$ to obtain $tX$ and the parts of 
$((1-t)(1-\lambda_2)Y,(1-t)\lambda_2Y)$ to obtain $(1-t)Y$, and finally 
we 
use the convex combination property A7 to reach 
$tX+(1-t)Y$.\nobreak\hfill\lanbox
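
\smallskip
{\it Remark:} In formulas, the argument for (ii) is the following
chain. It is a sketch that assumes, as in (2.12), that $X\in
\S_{\lambda_1}$ means $((1-\lambda_1)X_0,\lambda_1X_1)\prec X$, and
similarly for $Y$. With $\lambda=t\lambda_1+(1-t)\lambda_2$, so that
$1-\lambda=t(1-\lambda_1)+(1-t)(1-\lambda_2)$, we have
$$\eqalign{
((1-\lambda)X_0,\lambda X_1)
&\sima (t(1-\lambda_1)X_0,\,t\lambda_1X_1,\,(1-t)(1-\lambda_2)X_0,\,
(1-t)\lambda_2X_1)\cr
&\prec (t(1-\lambda_1)X,\,t\lambda_1X,\,(1-t)(1-\lambda_2)Y,\,
(1-t)\lambda_2Y)\cr
&\sima (tX,(1-t)Y)\cr
&\prec tX+(1-t)Y.\cr}
$$
The first and third steps use the splitting and recombination axiom A5
together with A3; the second step applies A4 and A5 to each of the two
hypotheses and then A3; the last step is A7.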

\bigskip
\medskip

{\bf THEOREM 2.8 (Concavity of entropy).}  
{\it Let $\Gamma$ be a convex state space. Assume 
axiom A7 in addition to A1-A6, and CH for multiple scaled copies of 
$\Gamma$. 
Then the entropy  $S_{\Gamma}$ defined by (2.14) is a  concave function 
on 
$\Gamma$.
Conversely, if $S_{\Gamma}$ is concave,
then axiom A7 necessarily holds a-fortiori.
}


{\it Proof:}  If $X \in \S_{\lambda_1}, Y \in \S_{\lambda_2}$, then by 
Theorem
2.7, (ii), $t X + (1 - t)Y \in \S_{t \lambda_1 + (1-t)\lambda_2}$, for 
$t,
\lambda_1, \lambda_2 \in [0,1]$.  By definition, this implies $S_{\Gamma} 
(t X +
(1 - t)Y) \geq t \lambda_1 + (1 - t)\lambda_2$.  Taking the supremum over 
all
$\lambda_1$ and $\lambda_2$ such that $X \in \S_{\lambda_1}, Y \in
\S_{\lambda_2}$, then gives $S_{\Gamma}(t X + (1 - t)Y) \geq t S_{\Gamma} 
(X) +
(1 - t) S_{\Gamma}(Y)$.  The converse is obvious.\hfill\lanbox
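
\smallskip
{\it Remark:} The converse can be made explicit under the hypotheses
of the theorem, assuming that Theorem 2.2 is available, so that
$S_{\Gamma}$ characterizes $\prec$ on the multiple scaled copies of
$\Gamma$ through the weighted sums of entropies. The entropy of the
compound state $(tX,(1-t)Y)$ is then $tS_{\Gamma}(X)+(1-t)S_{\Gamma}(Y)$,
and concavity (2.21) says precisely that this does not exceed
$S_{\Gamma}(tX+(1-t)Y)$, which is A7 in the form (2.22). As an
illustration (the numbers are chosen here and are not taken from the
text), consider the function $S^{\rm hom}(U,V)=V^{1/4}U^{3/4}$ of the
photon gas remark with $X=(16,1)$, $Y=(1,16)$ and $t=\mfr1/2$. Then
$S^{\rm hom}(X)=8$, $S^{\rm hom}(Y)=2$ and
$$
S^{\rm hom}\bigl(\mfr1/2 X+\mfr1/2 Y\bigr)
=S^{\rm hom}\bigl(\mfr{17}/2,\mfr{17}/2\bigr)=\mfr{17}/2
\ \geq\ \mfr1/2\cdot 8+\mfr1/2\cdot 2=5 .
$$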

\bigskip\noindent
{\subt G. Irreversibility and Carath\'eodory's principle}
\bigskip

One of the milestones in the history of the second law is
Carath\'eodory's attempt to formulate the second law in terms of purely
local properties of the equivalence relation $\sima$.  The
disadvantage of the purely local formulation is, as we said earlier, the
difficulty of deriving a globally defined concave entropy function.
Additionally, Carath\'eodory relies on differentiability (differential
forms), and we would like to avoid this, if possible, because physical
systems do have points (e.g., phase transitions) in their state spaces
where differentiability fails. Nevertheless, Carath\'eodory's idea
remains a powerful one and it does play  an important role in the story.
We shall replace it by a seemingly more natural idea, namely the
existence of irreversible processes.  {\it The existence of many such
processes lies at the heart of thermodynamics.\/} If they did not exist,
it would mean that nothing is forbidden, and hence there would be no
second law. We now  show the relation between the two concepts. There
will be no mention of differentiability, however. 

Carath\'eodory's principle has been criticized (see, for example, the 
remark attributed to Walter in Truesdell's paper in (Serrin, 1986, 
Chapter 5)) on the ground that this principle does not tell us where 
to look for a non adiabatic process that is supposed, by the 
principle, to exist in every neighborhood of every state.  In Sect.  
III and V we show that this criticism is too severe because the 
principle, when properly interpreted, shows exactly where to look and, 
in conjunction with the other axioms, it leads to the Kelvin-Planck 
version of the second law.

\medskip

{\bf THEOREM 2.9 (Carath\'eodory's principle and irreversible 
processes).}
{\it Let $ \Gamma$ be a state space that is  a convex subset of ${\bf 
R}^n$
and assume that axioms A1--A7 hold on $ \Gamma$. Consider the following
two statements. 
\item{(1)}{\bf Existence of irreversible processes:}
For every point $X \in   \Gamma$ there is a $Y \in  \Gamma$ such that
$X \prec \prec Y$.
\item{(2)}{\bf Carath\'eodory's principle:} In every
neighborhood of every $X \in   \Gamma$ there is a point $Z\in  \Gamma$
such that $X \sima Z$ is false. 
\smallskip
Then (1) always implies (2). Indeed,  (1) implies the stronger
statement that there is a $Z$ such that $X \prec Z$ is false. 
On the other hand, if all the forward sectors in $ \Gamma$ 
have non-empty interiors (i.e., they are not contained in lower 
dimensional
hyperplanes) then (2) implies (1). }
\medskip


{\it Proof: \/}  
Suppose that for some $X \in   \Gamma$ there is  a neighborhood, ${\cal
N}_X$ of $X$ such that ${\cal N}_X$ is contained in $A_X$, the forward
sector of $X$. (This is the negation of the statement that in every
neighbourhood of every $X$ there is a $Z$ such that $X\prec Z$ is
false.) Let $Y\in A_X$ be arbitrary.  By the convexity of $A_X$ (which
is implied by the axioms), $X$ is an interior point of a line segment
joining $Y$ and some point $Z\in {\cal N}_X$.
By axiom A7, we thus have 
$$
((1-\lambda )Z,\lambda Y) \prec X \sima ((1-\lambda )X,\lambda X)
$$
for some $\lambda \in (0,1)$. But we also have that $((1-\lambda)X,
\lambda Y)\prec ((1-\lambda )Z,\lambda Y) $ since $Z\in A_X$. This
implies, by the cancellation law, that $Y\prec X$. 
Thus we conclude that for some $X$, we have that $X\prec Y$ implies 
$X\sima 
Y$. This contradicts (1). In particular, we have shown that (1)
$\, \Rightarrow$(2).
\smallskip

Conversely, assuming that (1) is false, there is a point 
$X_0$
whose forward sector is given by $A_{X_0} = \{ Y:Y\sima X_0 \}$. Let 
$X$ be an interior point of $A_{X_0}$, i.e., there is a neighborhood
of $X$, ${\cal N}_{X}$, which is entirely contained in  $A_{X_0}$.
All points in ${\cal N}_{X}$ are adiabatically equivalent to $X_0$, 
however, 
and hence to $X$, since $X\in {\cal N}_{X}$. Thus, (2) is false. 
\hfill\lanbox

\bigskip\noindent
{\subt H. Some further results on uniqueness}
\bigskip

As stated in Theorem 2.2, the existence of an entropy function on a state 
space $\Gamma$ is equivalent to the axioms A1-A6 and CH for the multiple 
scaled copies of $\Gamma$.  The entropy function is unique, up to an 
affine change of scale, and according to  formula (2.14) it is even 
sufficient to know the relation on the double scaled copies 
$\Gamma^{(1-\lambda)}\times\Gamma^{(\lambda)}$ in order to compute the 
entropy.  This was the observation behind the uniqueness Theorem 2.4
which stated 
that the restriction of the relation $\prec$ to the double scaled copies 
determines the relation everywhere.

The following very general result shows that it is in fact not 
necessary to know $\prec$ on all 
$\Gamma^{(1-\lambda)}\times\Gamma^{(\lambda)}$ to determine the 
entropy, provided the relation is such that the range of the entropy 
is connected.  In this case $\lambda=1/2$ suffices.  By Theorem 2.8 
the range of the entropy is necessarily connected if the convex 
combination axiom A7 holds.  \medskip

{\bf THEOREM 2.10 (The relation on $\Gamma\times\Gamma$ determines 
entropy).}  
{\it Let $\Gamma$ be a set
and $\prec$ a relation on $\Gamma\times \Gamma$.  Let
$S$ be a real valued function on $\Gamma$ satisfying the following 
conditions:
\item{(i)} $S$ characterizes the relation on $\Gamma\times \Gamma$ in the 
sense that 
$$(X,Y)\prec (X',Y')\qquad\hbox{\sl if and only if}\qquad S(X)+S(Y)\leq
S(X')+S(Y')$$
\item{(ii)} The range of $S$ is an interval (bounded or unbounded and 
which
could even be a point).

Let $S^*$ be another function on 
$\Gamma$ satisfying condition (i). Then $S$ and 
$S^*$ are affinely related, i.e., there are numbers 
$a > 0$ and $B$ such that $S^*(X) = a S(X) + B$ 
for all $X \in \Gamma$. In particular, $S^*$ must satisfy condition 
(ii). }
\medskip


{\it Proof:}  In general, if $F$ and $G$ are any two real valued 
functions 
on $\Gamma
\times \Gamma$, 
such that $F(X,Y)\leq F(X',Y')$ if and only if $G(X,Y)\leq G(X',Y')$,
it is an easy logical exercise to show that there is a
monotone increasing function $K$ (i.e., $x\leq y$ implies $K(x)\leq 
K(y)$)
defined on the range of $F$, 
so that $G = K \circ F$.  In our
case $F(X,Y)=S(X) + S(Y)$. If the range of $S$ is the interval $L$ then 
the
range of $F$ is $2L$. Thus $K$, which is
defined on $2L$, satisfies
$$
K(S(X) + S(Y)) = S^* (X) + S^* (Y) \eqno(2.23)
$$
for all $X$ and $Y$ in $\Gamma$ because both $S$ and $S^*$ satisfy 
condition (i).  For convenience, define $M$ on $L$ by
$M(t) = \mfr1/2 K (2t)$.  If we now set $Y = X$ in (2.23) we obtain 
$$
S^* (X) = M (S(X)), \quad X \in \Gamma \eqno(2.24)
$$
and (2.23) becomes, in general,
$$ M \left( {x+y \over 2} \right) = \mfr1/2 M(x) + \mfr1/2 
M(y)\eqno(2.25)
$$
for all $x$ and $y$ in $L$.  Since $M$ is monotone, it is bounded on all 
finite subintervals of $L$. Hence (Hardy, Littlewood, Polya 1934)
$M$ is both concave and convex in the usual sense, i.e.,
$$
M (t x + (1- t) y) = t M(x) + (1 - t) M(y)
$$
for all $0 \leq t \leq 1$ and $x,y \in L$.  {}From this it follows
that $M(x) = a x + B$ with $a\geq 0$.  If $a$ were
zero then $S^*$ would be constant on $\Gamma$ which would imply that
$S$ is constant as well.  In that case we could always replace $a$
by 1 and replace $B$ by $B-S(X)$.
 \hfill\lanbox
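
\smallskip
{\it Remark:} For the reader's convenience, here is how (2.25) comes
out of (2.23) and (2.24). Given $x,y\in L$, pick $X,Y\in\Gamma$ with
$S(X)=x$ and $S(Y)=y$, which is possible because $L$ is the range of
$S$. Then
$$
2M\left({x+y\over 2}\right)=K(x+y)=S^*(X)+S^*(Y)=M(x)+M(y),
$$
where the first equality is the definition of $M$, the second is
(2.23) and the third is (2.24). The left side is defined because $L$
is an interval, so that $(x+y)/2\in L$; this is the only place where
condition (ii) enters.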

{\it Remark:}  It should be noted that Theorem 2.10 does not rely on
any structural property of $\Gamma$, which could be any abstract set.
In particular, continuity plays no role; indeed it cannot be defined
because no topology on $\Gamma$ is assumed. The only residue of
``continuity" is the requirement that the range of $S$ be an interval.

That condition (ii) is not superfluous for the uniqueness
theorem may be seen from the following simple counterexample.

{\bf EXAMPLE:} Suppose the state space $\Gamma$ consists of 3 points,
$X_0$, $X_1$ and $X_2$, and let $S$ and $S^*$ be defined by $S(X_0)=
S^*(X_0)=0$, $S(X_1)=S^*(X_1)=1$, $S(X_2)=3$, $S^*(X_2)=4$.  These
functions correspond to the same order relation on $\Gamma\times
\Gamma$, but they are not related by an affine transformation.
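\par The claim can be checked by direct arithmetic. The table below is
added purely as an illustration; it lists the sums over the six
unordered pairs, which suffices because both sums are symmetric in $X$
and $Y$:
$$
\matrix{
(X,Y) & S(X)+S(Y) & S^*(X)+S^*(Y) \cr
(X_0,X_0) & 0 & 0 \cr
(X_0,X_1) & 1 & 1 \cr
(X_1,X_1) & 2 & 2 \cr
(X_0,X_2) & 3 & 4 \cr
(X_1,X_2) & 4 & 5 \cr
(X_2,X_2) & 6 & 8 \cr}
$$
Both columns increase strictly from top to bottom, so $S$ and $S^*$
order the pairs in the same way. On the other hand, an affine map
$M(x)=ax+B$ with $M(0)=0$ and $M(1)=1$ must be the identity, whereas
$S^*(X_2)=4\neq 3=S(X_2)$. Here the range of $S$ is $\{0,1,3\}$, which
is not an interval, so condition (ii) fails.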

The following sharpening of Theorem 2.4 is an immediate corollary of 
Theorem 2.10 in the 
case that the convexity axiom A7 holds, so that the range of the 
entropy is connected. 

\medskip
{\bf THEOREM 2.11 (The relation on $\Gamma\times\Gamma$ determines the
relation everywhere)} {\it Let $\prec$ and $\prec^*$ be two relations
on the multiple scaled copies of $\Gamma$ satisfying axioms A1--A7,
and CH for $\Gamma^{(1-\lambda)}\times \Gamma^{(\lambda)}$ for each
fixed $\lambda\in[0,1]$.  If $\prec$ and $\prec^*$ coincide on
$\Gamma\times \Gamma$, i.e., $$ (X,Y) \prec (X^{\prime}, Y^{\prime})
\ \ \ {\it if \ and \  only \ if}\ \ \ (X,Y) \prec^* (X^{\prime},
Y^{\prime}) $$ for $X,X',Y,Y'\in\Gamma$, then $\prec$ and $\prec^*$
coincide on all multiple scaled copies of $\Gamma$.} \bigskip

%%%%%
As a last variation on the theme of this subsection let us note that
uniqueness of entropy does not even require knowledge of the order
relation $\prec$ on all of $\Gamma \times \Gamma$. The knowledge of
$\prec$  on a relatively thin ``diagonal" set will suffice, as Theorem
2.12 shows.  \medskip

{\bf THEOREM 2.12 (Diagonal sets determine entropy).} {\it Let
$\prec$ be an order relation on $ \Gamma \times \Gamma$ and let $S$ be
a function on  $\Gamma$ satisfying conditions (i) and (ii) of Theorem
2.10. Let ${\cal D}$ be a subset of $ \Gamma \times \Gamma$ with the
following properties:  

\item{(i)} $(X,X) \in {\cal D}$ for every $X \in  \Gamma$.  
\item{(ii)} The set 
$D= \{(S(X),S(Y))\in {\bf R}^2 \, : \, (X,Y)\in {\cal D} \}$
contains an open subset of ${\bf R}^2$ (which necessarily contains the
set $\{(x,x) : x\in {\rm Range}\, S\}$).

\medskip

Suppose now that $ \prec^*$ is another order relation on $ \ \Gamma 
\times
\Gamma$ and that $S^*$ is a function on $ \Gamma$ satisfying condition 
(i) of Theorem 2.10 with respect to $ \prec^*$ on $ \Gamma \times 
\Gamma$. 
Suppose further that $ \prec$ and $ \prec^*$ agree on ${\cal D}$, i.e., 
$$
(X,Y) \prec (X^{\prime}, Y^{\prime}) \ \ \ {\it if \ and \  only \ if}\ \ 
\
(X,Y) \prec^* (X^{\prime}, Y^{\prime})
$$
whenever $(X,Y)$ and $(X^{\prime}, Y^{\prime})$ are both in ${\cal D}$. 
Then $ \prec$ and $ \prec^*$ agree on all of  $\Gamma \times \Gamma$ and
hence, by Theorem 2.10, $S$ and $S^*$ are related by an affine
transformation. }

{\it Proof:} By considering points $(X,X) \in {\cal D}$, the consistency 
of
$S$ and $S^*$ implies that $S^*(X) = M(S(X))$ for all $X \in  \Gamma$, 
where
$M$ is some monotone increasing function on $L \subset {\bf R}$. Again, 
as
in the proof of Theorem 2.10, 
$$
\mfr1/2 M(S(X)) + \mfr1/2 M(S(Y)) = M\Bigl({S(X) + S(Y)\over 2}\Bigr) 
\eqno(2.26)
$$
for all $(X,Y)\in {\cal D}$. (Note: In deriving Eq.\ (2.25) we did 
not use the fact that $  \Gamma \times \Gamma$ was the Cartesian
product of two spaces; the only thin