# A LOGICAL MODEL OF GENETIC ACTIVITIES IN LUKASIEWICZ ALGEBRAS THE


Copyright © I.C. Baianu, 2004

COMPLEX SYSTEMS ANALYSIS OF CELL CYCLING MODELS

IN CARCINOGENESIS: II.

Cell Genome and Interactome, Non-random, Nonlinear Dynamic Models in

Łukasiewicz Logic Algebras-- Neoplastic Transformation Models in

Łukasiewicz-Topos with Pseudo-Markov Chain Processes representing

Progression Stages during Carcinogenesis and Cancer Therapy

I. C. Baianu, University of Illinois at Urbana, Urbana, IL 61801, USA; email: ibaianu@uiuc.edu

ABSTRACT

Carcinogenesis is a complex process that involves dynamically inter-connected modular

sub-networks that evolve under the influence of micro-environmental, as well as, in many cases,

cancer therapy-induced perturbations, in non-random, pseudo-Markov chain processes. An

appropriate n-stage model of carcinogenesis involves therefore n-valued Logic treatments of

such processes, nonlinear dynamic transformations of complex functional genomes and cell

interactomes. Lukasiewicz Algebraic Logic models of genetic networks and signaling

pathways in cells are formulated in terms of nonlinear dynamic systems with n-state

components that allow for the generalization of previous, Boolean or "fuzzy", logic models of

genetic activities in vivo. Such models are then applied to cell transformations during

carcinogenesis based on very extensive genomic transcription and translation data from the

CGAP databases supported by NCI. Inter-related signaling pathways include very large

numbers of different biomolecules, such as proteins, in the intercellular, membrane, cytosolic,

nuclear and nucleolar compartments. One such family of pathways contains the cell cyclins.

Cyclins are proteins that link several critical pro-apoptotic and other cell cycling/division

components, including the tumor suppressor gene TP53 and its product, the

Thomsen-Friedenreich antigen (T antigen), Rb, mdm2, c-Myc, p21, p27, Bax, Bad and Bcl-2, which all

play major roles in carcinogenesis of many cancers. A categorical and Łukasiewicz-Topos (LT)

framework is proposed for Łukasiewicz Algebraic Logic models of nonlinear dynamics in complex

functional genomes and cell interactomes. An algebraic formulation of varying 'next-state

functions' is extended to a Łukasiewicz Topos with an n-valued Łukasiewicz Algebraic Logics

subobject classifier description that represents non-random and nonlinear network activities

as well as their transformations in developmental processes and carcinogenesis. Specific models

for different types of cancer are then derived from representations of the dynamic state-space

of LT non-random, pseudo-Markov chain process, network models in terms of cDNA and

proteomic, high throughput analyses by ultra-sensitive techniques. This novel theoretical

analysis is based on extensive CGAP genomic data for human tumors, as well as recently

published studies of cyclin signaling, with special emphasis placed on the roles of cyclins D1

and E. Several such specific models suggest novel clinical trials and rational therapies of

cancer through re-establishment of cell cycling inhibition in stage III cancers.

1. Introduction.

The calculus of predicates of formal Hilbert Logic was applied by Nicolas Rashevsky (1965) to generate organismic set theories based on relational biology. Löfgren (1968) subsequently

introduced a quite different, more general logical approach than that of Rashevsky's predicate

calculus to the problem of self-reproduction. An attempt to provide a characterization of genetic

activities in terms of n-valued logics and generalized biodynamic state-spaces is introduced. For

operational reasons the model is directly formulated in an algebraic form by means of Łukasiewicz

algebras. The Łukasiewicz algebras were introduced by Moisil (1940) as algebraic models of

n-valued logics, and further improvements were made by utilizing categorical constructions of

Łukasiewicz Logic algebras (Georgescu and Vraciu, 1970).

Had the structural genes presented an "all-or-none" type of response to the action of regulatory

genes, very simple neural nets might have some partial dynamical analogy to a correspondingly

small genetic network. Then, both types of net would be only two distinct realizations of a net which

is built up of two-factor elements (Rosen, 1970). This would allow for a detailed dynamical analysis

of their action (Rosen, 1970). However, the case which we consider first is the one in which the

activity of the genes is not necessarily of the "all-or-none" type. Nevertheless, the representation of

elements of a net (in our case these are genes, operons, or groups of genes), as black boxes is

convenient for formal reasons, and will be maintained in the sequel (see Figure 1).

2. Nonlinear Dynamics in Non-Random Genetic Network Models in Łukasiewicz Logic Algebras.

Jacob and Monod (1961) have shown that in E. coli the "regulator gene" and three "structural

genes" concerned with lactose metabolism lie near one another in the same region of the

chromosome. Another special region near one of the structural genes has the capacity of responding

to the regulator gene, and it is called the "operator gene". The three structural genes are under the

control of the same operator and the entire aggregate of genes represents a functional unit or

"operon". The presence of this "clustering" of genes seems to be doubtful in the case of higher

organisms, and therefore, more complex networks of genes and genetic network modules are now

being extensively studied in conjunction with genomic and proteomic data analysis for

medically oriented purposes, such as individualized cancer therapy development. Carcinogenesis is a complex

process that involves dynamically inter-connected modular sub-networks that evolve under the

influence of micro-environmental, as well as, in many cases, cancer therapy-induced perturbations,

in non-random, pseudo-Markov chain processes. An appropriate n-stage model of carcinogenesis

involves therefore n-valued Logic treatments of such processes, nonlinear dynamic transformations

of complex functional genomes and cell interactomes. Lukasiewicz Algebraic Logic models of

genetic networks and signaling pathways in cells are formulated in terms of nonlinear dynamic

systems with n-state components that allow for the generalization of previous, Boolean or "fuzzy",

logic models of genetic activities in vivo. Such models are then applied to cell transformations

during carcinogenesis based on very extensive genomic transcription and translation data from the

CGAP databases supported by NCI. Inter-related signaling pathways include very large numbers of

different biomolecules, such as proteins, in the intercellular, membrane, cytosolic, nuclear and

nucleolar compartments. One such family of pathways contains the cell cyclins. A detailed model of

cell cyclins in carcinogenesis was recently presented (Baianu and Prisecaru, 2004;

arXiv:q-bio.OT/0406046 preprint). Thus, it would be natural to term any assembly, or aggregate, of

interacting genes as a genetic network, without considering the 'clustering' of genes as a necessary

condition for all biological organisms. Had the structural genes presented an "all-or-none" type of

response to the action of regulatory genes, the neural nets might be considered to be dynamically

analogous to the corresponding genetic networks, especially since the former also have coupled,

intra-neuronal signaling pathways resembling, but distinct from, those of other types of cells in

higher organisms. In a broad sense, both types of network could be considered as two distinct

realizations of a network which is built up of two-factor elements (Rosen, 1970). This allows for a

detailed dynamical analysis of their action (Rosen, 1970). However, the case that was considered

first as being the more suitable alternative (Baianu, 1977) is the one in which the activities of the

genes are not necessarily of the "all-or-none" type. Nevertheless, the representation of elements of a

net (in our case these are genes, operons, or groups of genes), as black boxes is convenient, and is

here retained to keep the presentation both simple and intuitive (see Figure 1). Previously, the

assumption was made (Baianu, 1977) that certain genetic activities have n levels of intensity, and this assumption is justified both by the existence of epigenetic controls and by the coupling of

the genome to the rest of the cell through specific signaling pathways that are involved in the

modulation of both translation and transcription control processes. This model is a description of genetic activities in terms of n-valued Łukasiewicz logics. For operational reasons the model is directly formulated in an algebraic form by means of Łukasiewicz Logic algebras. The formalization of genetic networks that was introduced previously (Baianu, 1977) in terms of Łukasiewicz Logic, and the appropriate definitions are here briefly recalled in order to maintain a self-contained

presentation.

The genetic network presented in Figure 1 is a discriminating network (Rosen, 1970). Let us consider the system in Figure 1b and apply to it a simple formalization that converts the system into logical expressions with n values. The level (chemical concentration) of P1 is zero when the operon A is inactive, and it takes definite non-zero values on levels '1', '2', ..., '(n-1)' otherwise. The first level of A is obtained for a threshold value $\mu_0^A$ of P2 that corresponds to a certain level 'j' of B. Similarly, the other corresponding thresholds for levels 1, 2, 3, ..., (n-1) are, respectively, $\mu_1^A, \mu_2^A, \mu_3^A, \ldots, \mu_{n-1}^A$. The thresholds are indicated inside the black boxes, in sequential order, as shown in Figure 2. Thus, if A is inactive (that is, on the zero level), then B will be active on the k level, which is characterized by a certain concentration of P2. Symbolically, we write:

$$A(t; 0) \;.=.\; B(t + \delta; k), \qquad (1)$$

where t denotes time and δ is the 'time lag' or delay after which the inactivity of A is reflected in the activity of B, on the k level. Similarly, one has:

$$A(t' + \varepsilon; n-1) \;.=.\; B(t'; 0).$$

The levels of A and B, as well as the time lags δ and ε, need not be the same. More complicated

situations arise when there are many concomitant actions on the same gene. These situations are

analogous to a neuron with alterable synapses. Such complex situations could arise through

interactions which belong to distinct metabolic pathways.
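The delayed rules above can be sketched as a small simulation. This is only an illustrative sketch: the function name `simulate`, the integer time steps, and in particular the rule that a gene returns to level 0 whenever its controlling gene was active at the lagged time are assumptions made here, not statements from the text.

```python
def simulate(n, k, lag_ab, lag_ba, a0, b0, steps):
    """Two-gene net with levels 0..n-1 and integer time lags (both >= 1).

    Rules taken from the text:
      A inactive at time t          -> B on level k   at time t + lag_ab
      B inactive at time t'         -> A on level n-1 at time t' + lag_ba
    Assumption (not in the text): an active controlling gene switches
    its target back to level 0 after the same lag.
    """
    a, b = [a0], [b0]
    for t in range(1, steps + 1):
        # Until a lag has elapsed, a gene simply keeps its previous level.
        src_a = a[t - lag_ab] if t >= lag_ab else None
        src_b = b[t - lag_ba] if t >= lag_ba else None
        b.append(b[-1] if src_a is None else (k if src_a == 0 else 0))
        a.append(a[-1] if src_b is None else (n - 1 if src_b == 0 else 0))
    return a, b


a, b = simulate(n=4, k=2, lag_ab=1, lag_ba=2, a0=0, b0=0, steps=4)
print(a)  # [0, 0, 3, 0, 0]
print(b)  # [0, 2, 2, 0, 2]
```

With unequal lags the two genes do not settle immediately, which illustrates why the levels and the time lags are treated as independent parameters.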

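[Figure 1 image not reproduced. Panel (a) carries the labels RG1, O1, SG1, RG2, O2, SG2, P1, P2 and E1; panel (b) shows the corresponding black boxes A and B with the labels P1, P2, S1 and S2.]

Figure 1. The simplest control unit in a genetic net and its corresponding black-box images.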

In order to be able to deal with any particular situation of this type one needs the symbols of n-valued logics. Re-label the last, (n-1), level of a gene by 1. An intermediate level of the same gene should then be relabeled by a lower case letter, x or y. The zero level will be labeled by '0', as before. Assume that the levels of all other genes can be represented by intermediate levels. (This is only a convenient convention and it does not impose any further restriction on the number of situations which could arise.) With all assertions of the type "gene A is active on the i-th level and gene B is active on the j-th level" one can form a distributive lattice, L. The composition laws for the lattice will be denoted by $\vee$ and $\wedge$. The symbol $\vee$ will stand for the logical non-exclusive 'or', and $\wedge$ will stand for the logical conjunction 'and'.

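[Figure 2 image not reproduced: black boxes A and B, with the thresholds $\mu_0^A, \mu_1^A, \ldots, \mu_{n-1}^A$ and $\mu_0^B, \mu_1^B, \ldots, \mu_{n-1}^B$ indicated inside the boxes; other labels: P2, S1, S2.]

Figure 2. Black-boxes with n levels of activity.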


Another symbol, $\leq$, allows for the ordering of the levels and is the canonical ordering of the lattice. Then, one is able to give a symbolic characterization of the dynamics of a gene of the net with respect to each level i. This is achieved by means of the maps $\delta_i: L \to L$ and $N: L \to L$ (with N being the negation). The necessary logical restrictions on the actions of these maps lead to an n-valued Łukasiewicz algebra.

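(I) There is a map $N: L \to L$, so that $N(N(X)) = X$, $N(X \vee Y) = N(X) \wedge N(Y)$ and $N(X \wedge Y) = N(X) \vee N(Y)$, for any $X, Y \in L$.

(II) There are (n-1) maps $\delta_i: L \to L$ which have the following properties:

(a) $\delta_i(0) = 0$, $\delta_i(1) = 1$, for any $i = 1, 2, \ldots, n-1$;

(b) $\delta_i(X \vee Y) = \delta_i(X) \vee \delta_i(Y)$, $\delta_i(X \wedge Y) = \delta_i(X) \wedge \delta_i(Y)$, for any $X, Y \in L$, and $i = 1, 2, \ldots, n-1$;

(c) $\delta_i(X) \vee N(\delta_i(X)) = 1$, $\delta_i(X) \wedge N(\delta_i(X)) = 0$, for any $X \in L$;

(d) $\delta_1(X) \leq \delta_2(X) \leq \ldots \leq \delta_{n-1}(X)$, for any $X \in L$;

(e) $\delta_h \circ \delta_k = \delta_k$, for $h, k = 1, \ldots, n-1$;

(f) If $\delta_i(X) = \delta_i(Y)$ for any $i = 1, 2, \ldots, n-1$, then $X = Y$;

(g) $\delta_i(N(X)) = N(\delta_j(X))$, for $i + j = n$.

(Georgescu and Vraciu, 1970).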
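As a concrete check, axioms (I) and (IIa)-(IIg) can be verified by brute force on the simplest example, a finite chain of n levels. The sketch below is illustrative only: the integer encoding of the levels 0, 1, ..., n-1 (with n-1 playing the role of '1'), the negation N(x) = (n-1) - x, and the explicit form chosen for the maps δ_i are assumptions made here, one standard choice rather than something given in the text.

```python
from itertools import product


def make_chain(n):
    """Chain of levels 0..n-1; n-1 is the maximal level ('1' in the text)."""
    top = n - 1

    def neg(x):                 # negation N
        return top - x

    def delta(i, x):            # assumed form of delta_i, for i = 1..n-1
        return top if x + i >= n else 0

    return range(n), top, max, min, neg, delta


def check_axioms(n):
    L, top, join, meet, N, d = make_chain(n)
    idx = range(1, n)
    for i in idx:                                   # (IIa)
        assert d(i, 0) == 0 and d(i, top) == top
    for x, y in product(L, L):
        assert N(N(x)) == x                         # (I) double negation
        assert N(join(x, y)) == meet(N(x), N(y))    # (I) De Morgan
        assert N(meet(x, y)) == join(N(x), N(y))
        for i in idx:                               # (IIb)
            assert d(i, join(x, y)) == join(d(i, x), d(i, y))
            assert d(i, meet(x, y)) == meet(d(i, x), d(i, y))
        if all(d(i, x) == d(i, y) for i in idx):    # (IIf)
            assert x == y
    for x in L:
        for i in idx:
            assert join(d(i, x), N(d(i, x))) == top     # (IIc)
            assert meet(d(i, x), N(d(i, x))) == 0
            assert d(i, N(x)) == N(d(n - i, x))         # (IIg), i + j = n
            for h in idx:                               # (IIe)
                assert d(h, d(i, x)) == d(i, x)
        # (IId): delta_1(x) <= delta_2(x) <= ... <= delta_{n-1}(x)
        assert all(d(i, x) <= d(i + 1, x) for i in range(1, n - 1))
    return True


print(all(check_axioms(n) for n in (2, 3, 4, 5)))  # True
```

For n = 2 the single map δ_1 is the identity on {0, 1}, which is the Boolean, "all-or-none", case.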

The first axiom states that the double negation has no effect on any assertion concerning any level, and that a simple negation changes the disjunction into conjunction and conversely. The second axiom presents in fact ten sub-cases, which are summarized in equations (a)-(g). Sub-case (IIa) states that the dynamics of the genetic net is such that it maintains the genes structurally unchanged. It does not allow for mutations which would alter the lowest and the highest levels of activity of the genetic net, and which would, in fact, change the whole net. Thus, the maps $\delta_i: L \to L$ are chosen to represent the dynamical behavior of the genetic nets in the absence of mutations.

Equation (IIb) shows that the maps $\delta_i$ maintain the type of conjunction and disjunction.

Equations (IIc) are chosen to represent assertions of the following type: the sentence "a gene is active on the i-th level or it is inactive on the same level" is always true, and the sentence "a gene is active on the i-th level and it is inactive on the same level" is always false.

Equation (IId) actually defines the actions of the maps $\delta_i$. Thus, $\delta_1$ is chosen to represent a change from a certain level to a level as low as possible, just above the zero level of L. $\delta_2$ carries a certain level x in assertion X just above the same level in $\delta_1(X)$; $\delta_3$ carries the level x, which is present in assertion X, just above the corresponding level in $\delta_2(X)$, and so on.

Equation (IIe) gives the rule of composition for the maps $\delta_i$.

Equation (IIf) states that any two assertions which have equal images under all maps $\delta_i$ are equal. Equation (IIg) states that the application of $\delta_i$ to the negation of proposition X leads to the negation of proposition $\delta_j(X)$, if $i + j = n$.

The behavior of a genetic network can also be intuitively pictured by a table with k columns, corresponding to the genes of the net, and with rows corresponding to the moments which are counted backwards from the present moment p. The positions in the table are filled with 0's, 1's and letters i, j, ..., (n-1), which stand for levels in the activity of genes. Thus, a 1 in the i-th column denotes the maximal activity of the i-th gene. For example, with k = 3, the table might be as in Table I.

Table I. A table representation of the behavior of a particular genetic net

Time    A    B    C
p       i    0    1
p-ε     k    0    1
p-δ     1    1    0
…

The 0 in the first row and the first column means that gene A is inactive at time p; the 1 in the first

row and second column means that C is active on the i-th level of intensity, at the same moment. In

order to characterize mutations of genetic networks one has to consider mappings of n-valued Łukasiewicz algebras. These lead, in turn, to categories of genetic networks that contain all such networks together with all of their possible transformations and mutations.

(D2) A mapping $f: L_1 \to L_2$ is called a morphism of Łukasiewicz algebras if it has the following properties:

(M1) $f(0) = 0$, $f(1) = 1$, $f \circ N = N \circ f$;

(M2) $f(X \vee Y) = f(X) \vee f(Y)$, $f(X \wedge Y) = f(X) \wedge f(Y)$, for any $X, Y \in L_1$;

(M3) $f \circ \delta_i = \delta_i \circ f$, for any $i = 1, 2, \ldots, n-1$.
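Conditions (M1)-(M3) can be tested exhaustively on small examples. The following sketch is illustrative and rests on assumptions made here: algebras are modelled as finite products of chains with levels 0..n-1 and componentwise operations, and the helper names `ops` and `is_morphism` are hypothetical.

```python
from itertools import product


def ops(n):
    """Componentwise operations on tuples of levels 0..n-1 (a product of chains)."""
    def join(X, Y):
        return tuple(map(max, X, Y))

    def meet(X, Y):
        return tuple(map(min, X, Y))

    def neg(X):
        return tuple(n - 1 - x for x in X)

    def delta(i, X):            # assumed form of delta_i on each component
        return tuple(n - 1 if x + i >= n else 0 for x in X)

    return join, meet, neg, delta


def is_morphism(f, n, dim_src, dim_dst):
    """Check (M1)-(M3) for f between products of dim_src and dim_dst chains."""
    join, meet, neg, delta = ops(n)
    src = list(product(range(n), repeat=dim_src))
    if f((0,) * dim_src) != (0,) * dim_dst:            # (M1) f(0) = 0
        return False
    if f((n - 1,) * dim_src) != (n - 1,) * dim_dst:    # (M1) f(1) = 1
        return False
    for X in src:
        if f(neg(X)) != neg(f(X)):                     # (M1) f o N = N o f
            return False
        if any(f(delta(i, X)) != delta(i, f(X)) for i in range(1, n)):
            return False                               # (M3)
        for Y in src:                                  # (M2)
            if f(join(X, Y)) != join(f(X), f(Y)):
                return False
            if f(meet(X, Y)) != meet(f(X), f(Y)):
                return False
    return True


print(is_morphism(lambda X: (X[0],), 3, 2, 1))                 # True: a projection
print(is_morphism(lambda X: (X[0], X[0]), 3, 1, 2))            # True: the diagonal map
print(is_morphism(lambda X: (2 if X[0] else 0,), 3, 1, 1))     # False: collapses the middle level
```

The last map preserves 0, 1, joins and meets but not the negation, so a lattice morphism need not be a morphism of Łukasiewicz algebras.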

The totality of mutations of genetic nets is then represented by a subcategory of $\mathrm{Luk}_n$, the category of n-valued Łukasiewicz algebras and morphisms among these, as discussed next in Section 3. A special case of n-valued Łukasiewicz algebras is that of centered Łukasiewicz algebras, that is, those algebras in which there exist (n-2) elements $a_1, a_2, \ldots, a_{n-2} \in L$ (called centers), such that:

$$\delta_i(a_j) = \begin{cases} 0, & \text{for } 1 \leq i < n-j, \\ 1, & \text{for } n-j \leq i \leq n-1. \end{cases}$$

If the activity of genes were of the "all or none" type, then we would have to consider genetic nets as represented by Boolean algebras. The totality of mutations of "all or none" type genes would then be represented by a subcategory of Bl, the category of Boolean algebras. However, there exists an equivalence between the category of centered Łukasiewicz algebras and the category of Boolean algebras. This equivalence is expressed by two adjoint functors

$$\mathrm{Luk}_n \xrightarrow{\;C\;} \mathrm{Bl} \xrightarrow{\;D\;} \mathrm{Luk}_n,$$

with C being full and faithful (Georgescu and Vraciu, 1970). The above algebraic result shows that the particular case n=2 (that is, "all or none" response) can be treated by means of centered Łukasiewicz algebras.
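The center condition can be illustrated on a finite chain of n levels. The sketch assumes, as is not stated in the text, that the levels are the integers 0..n-1, that δ_i(x) equals the top level when x + i ≥ n and 0 otherwise, and that the condition reads δ_i(a_j) = 0 for i < n-j and 1 for i ≥ n-j; under these assumptions the intermediate levels themselves serve as centers.

```python
def delta(n, i, x):
    """Assumed form of delta_i on the chain 0..n-1 (top level n-1 stands for '1')."""
    return n - 1 if x + i >= n else 0


def centers(n):
    """Candidate centers a_1, ..., a_{n-2}: the intermediate levels of the chain."""
    return list(range(1, n - 1))


def satisfies_center_condition(n):
    top = n - 1
    for j in centers(n):
        for i in range(1, n):
            expected = 0 if i < n - j else top
            if delta(n, i, j) != expected:
                return False
    return True


print(centers(4))                                               # [1, 2]
print(all(satisfies_center_condition(n) for n in range(2, 7)))  # True
print([delta(2, 1, x) for x in (0, 1)])                         # [0, 1]
```

For n = 2 there are no centers to check and δ_1 is the identity, so the chain reduces to the two-element Boolean algebra of the "all or none" case.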

3. Categories and Topoi of Genetic Networks

Let us consider next categories of genetic networks that are collections of such networks and their

functional transformations. These are in fact subcategories of $\mathrm{Luk}_n$, the category of Łukasiewicz

Logic Algebras and their connecting "morphisms". The totality of the genes present in a given

organism, or genome, can thus be represented as an object in the associated category of genetic

networks of that organism. Let us denote this category of genetic networks by N, and call it the

genetic transformation category. There exists a genetic network and its associated

transformations in N that corresponds to the fertilized ovum from which the organism developed.

This genetic net will be denoted by 0, or $G_0$.

Theorem 1. The Category N of Genetic Networks of any organism has a projective limit.

Proof. To prove this theorem is to give an explicit construction of the genetic net which realizes the projective limit. If $G_1, G_2, \ldots, G_l$ are distinct genetic nets, corresponding to different stages of development of a certain organism, then let us define the cartesian product of the last (l-1) genetic nets, $\prod_{j=2}^{l} G_j$, as the product of the underlying lattices $L_2, L_3, \ldots, L_l$. Correspondingly, (l-1)-tuples are formed with the sentences present in $L_2, L_3, \ldots, L_l$ as members. The theorem is proven by the commutativity of the diagram

[Diagram not reproduced: a triangle with vertices $\prod_{j=2}^{l} G_j$, $G_m$ and $G_k$.]

for any $G_k$ and $G_m$ in the sequence $G_2, G_3, \ldots, G_l$ such that m > k. The commutativity of this diagram is compatible with conditions (M1), (M2) and (M3) that define morphisms of lattices. Moreover,

l

Gi= ЏGi

i=0

and one also has that G=0 . Q.E.D. i

This result shows that the genetic network corresponding to a fertilized ovum is the projective limit

of all subsequent genetic networks corresponding to later stages of development of that organism.

Such an important algebraic property represents the 'potentialities for development of a fertilized

ovum'.
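In standard categorical notation, the commutativity condition used for a projective limit can be sketched as follows. The names $p_m$ (projection onto the net of stage m) and $f_{mk}$ (a connecting morphism from the later net $G_m$ to the earlier net $G_k$) are assumptions introduced here for illustration; the text names neither map and does not fix the direction of the connecting morphisms.

```latex
% Assumed notation: p_m projects the product onto G_m; f_{mk} connects G_m to G_k for m > k.
\[
  p_m : \prod_{j=2}^{l} G_j \longrightarrow G_m ,
  \qquad
  f_{mk} : G_m \longrightarrow G_k ,
  \qquad
  f_{mk} \circ p_m = p_k \quad (m > k).
\]
% Because the operations on the product are taken componentwise,
% each projection commutes with N and with every \delta_i, for example
\[
  p_m\bigl(\delta_i(X_2, \ldots, X_l)\bigr) = \delta_i(X_m),
  \qquad i = 1, \ldots, n-1 ,
\]
% which is condition (M3) for p_m.
```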

Theorem 2. Any family of Genetic Networks of N has a direct sum, and also a cokernel exists

in N.

The proof is immediate and stems from the categorical definitions of direct sum and

cokernel (Mitchell,1965; and Baianu, 1970, 1977, in the context of organismic models). The above

two theorems show a dominant feature of the category of genetic nets. The algebraic properties of

N are similar to those exhibited by the category of all automata (sequential machines), and by its

subcategory of (M, R)-systems, MR (for details see theorems 1 and 2, Baianu, 1973).

Furthermore, Theorems 1 and 2 hint at a more fundamental conjecture stating that: "There exist adjoint functors (Baianu, 1970) between the category of genetic networks described here and the category of (M,R)-systems characterized previously (Theorems 1 and 2 of Baianu, 1977, and Baianu, 1973, respectively); there are also certain Kan extensions of the (M,R)-systems category in the N, and $\mathrm{Luk}_n$, categories that could be constructed explicitly for specific equivalence classes of (M,R)-systems and their underlying, adjoint genetic networks". Such Kan extensions may be restricted to the subcategory of centered Łukasiewicz Logic Algebras and their Boolean-compatible dynamic transformations of (M,R)-systems, with the latter as defined by Rosen (1971, 1973).

4. Realizability of Genetic Networks.

The genes in a given network G will be relabeled in this section by $g_1, g_2, g_3, \ldots, g_N$. The

peripheral genes of G are defined as the genes of G which are not influenced by the activity of other

genes, and that in their turn do not influence more than one gene by their activity. Such genes have

connectivities that are very similar to those present in random genetic networks, and could be

presumably studied in Łukasiewicz Logic extensions of random genetic networks, rather than in

strictly Boolean logic nets. The intermediate case of centered Łukasiewicz Algebra models of

random genetic networks will thus provide a seamless link between various types of logic-based

random networks, and also to Bayesian analysis of simpler organism genomes, such as that of yeast,

and possibly those of Archaea.
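The definition of peripheral genes translates directly into a condition on a directed graph: no incoming influence and at most one outgoing influence. The sketch below assumes a hypothetical encoding of the network as a dictionary that maps each gene to the set of genes it influences; the gene names are illustrative only.

```python
def peripheral_genes(influences):
    """Genes that no gene influences and that influence at most one gene.

    `influences` maps a gene name to the set of genes it acts on
    (hypothetical encoding, introduced for this sketch).
    """
    influenced = {g for targets in influences.values() for g in targets}
    genes = set(influences) | influenced
    return {
        g for g in genes
        if g not in influenced and len(influences.get(g, ())) <= 1
    }


net = {
    "g1": {"g2"},          # no input, one output   -> peripheral
    "g2": {"g3"},
    "g3": {"g2"},
    "g4": {"g2", "g3"},    # no input, two outputs  -> not peripheral
}
print(peripheral_genes(net))  # {'g1'}
```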

The assertion A(t;0) in (1) is called the action of gene $g_A$. The predicates which define the activities

of genes comprise their syntactical class. As in the formalization inspired by McCulloch and Pitts, a

solution of G will be a class of sentences of the form:

S: A(z1). ? . Pri(A,B,…, N;Z) tp+1pn

with $Pr_i$ being a predicate expression which contains no free variable save $z_1$, and such that $S_t$ has one

of the values of this Łukasiewicz n-Logic, except zero. The functor S is defined by the two following

equalities:

S(P)(t;k).?. P(kx) .t = x k

$$S^2(Pr) = S(S(Pr)), \;\ldots,\; S^k(Pr) = S(S(\ldots(S(Pr))\ldots)) \quad (k \text{ times}).$$

Given a predicate expression $S^m(Pr_1)(P_1, \ldots, P_p, z_1)$, with m a natural number and s a constant sequence, then it is said to be realizable if there exists a genetic, or neural, network G and a series of activities such that:

$$A_1(z_1) = Pr_1(A_1, A_2, \ldots, z_1, s_{a_1})$$

has a non-zero logical value for $s_{a_1} = A(0)$. Here the realizing gene will be denoted by $g_{p_1}$.

Two laws concerning the activities of the genes, which are such that every S which is

realizable for one of them is also realizable for the other, will be called equivalent. Equivalent genes may have additional algebraic structures in terms of topological groupoids (that is, categories consisting of topological space isomorphisms; Ehresmann, 1956; Brown, 1975) and subcategories of

$\mathrm{Luk}_n$ that contain such topological groupoids of equivalent genes, TopGd.

A genetic network will be called cyclic if each gene of the net is arranged in a functional chain

with the same beginning and end. In a cyclic net each gene acts on its next neighbor and is

influenced by its preceding neighbor. If a set of genes $g_1, g_2, g_3, \ldots, g_p$ of the genetic net G is such that its removal from G leaves G without cycles, and if no proper subset has this property, then the

set is called cyclic. The cardinality of this set is an index of the complexity of its behavior. It will be

seen later that this index does not uniquely determine the complexity of behavior of a genetic

network. Furthermore, such cyclic subnetworks of the genome may have additional algebraic

structure that can be characterized by a certain type of algebraic groups that will be called genetic

groups, and will be forming a Category of Genetic Groups, GrG, with group transformations as group morphisms. GrG is obviously a subcategory of N, the genetic network transformation category,

or the category of time-dependent genomes. In its turn, the category N is a subcategory of the higher

order Cell Interactome category, IntC, that includes all signaling pathways coupled to the genetic

networks, as well as their dynamic transformations and other metabolic components and processes

essential to cell survival, growth, development, division and differentiation.
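The cyclic sets defined above (minimal sets of genes whose removal leaves the net without cycles) can be enumerated by brute force for small networks. The sketch below is illustrative: the dictionary encoding of the net, the gene names, and the function names `has_cycle` and `cyclic_sets` are assumptions made here, and the exhaustive search is only practical for a handful of genes.

```python
from itertools import combinations


def has_cycle(graph, removed=frozenset()):
    """Depth-first search for a directed cycle, ignoring genes in `removed`."""
    state = {}  # gene -> "open" while on the search stack, "done" afterwards

    def visit(g):
        state[g] = "open"
        for h in graph.get(g, ()):
            if h in removed:
                continue
            if state.get(h) == "open" or (h not in state and visit(h)):
                return True
        state[g] = "done"
        return False

    return any(g not in removed and g not in state and visit(g) for g in graph)


def cyclic_sets(graph):
    """All minimal gene sets whose removal leaves the net without cycles."""
    genes = sorted(set(graph) | {h for ts in graph.values() for h in ts})
    found = []
    for size in range(len(genes) + 1):          # smallest sets first
        for subset in combinations(genes, size):
            s = frozenset(subset)
            if any(f <= s for f in found):
                continue                        # a smaller set already suffices
            if not has_cycle(graph, s):
                found.append(s)
    return found


net = {"g1": ["g2"], "g2": ["g3"], "g3": ["g1", "g4"], "g4": ["g3"]}
for s in cyclic_sets(net):
    print(sorted(s))
# ['g3']
# ['g1', 'g4']
# ['g2', 'g4']
```

In this example the same net has cyclic sets of cardinality 1 and 2, so the cardinality depends on which cyclic set is chosen.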

There is, therefore, in terms of the organizational hierarchy and complexity indices of the various categories of networks, the following partial, and strict, ordering:

Automata Semigroup Category (ASG) < MR < $\mathrm{CtrLuk}_n$ < GrG < TopGd < IntC < $\mathrm{Luk}_n$

This sequence of network structure models forms a finite, organizational semi-lattice of subcategories

of network models in Lukn. Their classification can be effectively carried out by selecting the

Łukasiewicz Logic Algebras as the subobject classifier in a Łukasiewicz Logic Algebras Topos

(Baianu et al., 2004) that includes the cartesian closed category (Baianu, 1973) of all networks that has

limits and colimits. A particularly interesting example is that of the TopGd category that will contribute certain associated sheaves of genetic networks with striking, 'emerging' properties such as

'genetic memory' that perhaps reflects underlying holonomic quantum genetic processes, as well as