Emotional Learning as a New Tool for Development of Agent-based Systems


    Informatica 27 (2003) 167-174


    Mehrdad Fatourechi

    Department of Electrical and Computer Engineering,

    University of Tehran, Tehran, Iran

Caro Lucas

    Center for Excellence and Intelligent Processing, Department of Control and Electrical Engineering

    University of Tehran, Tehran, Iran

Ali Khaki Sedigh

    Department of Electrical Engineering, K.N.Toosi

    University of Tehran, Tehran, Iran

    Keywords: Intelligent control, multivariable systems, emotional learning, neurofuzzy control, agents.

Received: October 21, 2002

    A new approach for the control of dynamical systems is presented based on the agent concept. The

    control system consists of a set of neurofuzzy controllers whose weights are adapted according to

    emotional signals provided by blocks called emotional critics. Simulation results are provided for the

    control of dynamical systems with various complexities in order to show the effectiveness of the

    proposed method.

1 Introduction

It is widely believed that decision making, even in the case of human agents, should be based on full rationality, and that emotional cues should be suppressed so that they do not influence the logic of arriving at proper decisions. The assumption of full rationality, however, has sometimes been abandoned in favor of satisficing or bounded rationality models [1], and in recent years the positive and important role of emotions has been emphasized not only in psychology but also in AI and robotics ([2]-[4]). Very briefly, emotional cues can provide an approximate method for selecting good actions when uncertainties and limited computational resources render fully rational decision making based on Bellman-Jacobi recursions impractical.

In past research ([5-9]), a very simple cognitive/emotional state designated as stress has been successfully utilized in various control applications. This approach is actually a special case of the popular reinforcement learning technique. In this case, however, the continual assessment of the present situation in terms of overall success or failure is no longer a simple behaviorist type of conditioning; it is closer to the definition of cognitive state modification and adaptation learning, so the designation of emotional learning seems more appropriate. We should emphasize that here emotion merely refers to the stress cue; the use of other, and perhaps higher, emotional cues is left for future research.

On the other hand, in recent years fuzzy logic has been extensively employed in the design of industrial control systems, because fuzzy controllers can work well as supervisory controllers under conditions such as severe nonlinearities, time-varying parameters or plant uncertainties. Also, in the last decade the intelligent control community has paid great attention to neurofuzzy control, which combines the decision-making property of fuzzy controllers with the learning ability of neural networks. Hence we have chosen a neurofuzzy system as the controller in our methodology.

In the present paper, the idea of applying emotional learning [8] to dynamic control systems using agent concepts [10] is addressed. This paper can be considered as a general framework covering the previous Single-Input Single-Output (SISO) works ([6]-[9]) and NSISO systems ([5]). In general, the control scheme consists of a set of agents whose task is to provide appropriate control signals for their corresponding system inputs. Each agent consists of a neurofuzzy controller and a number of critics, which evaluate the behavior of the plant outputs and provide the appropriate signals for tuning the controllers. Simulation results for the control of the Van der Pol system (single-agent single-critic approach), a strongly coupled plant with uncertainty (multi-agent multi-critic


approach) and the famous inverted pendulum benchmark (single-agent multi-critic approach) are provided to show the effectiveness of the proposed methodology.

The main contribution of the current paper is the introduction of an easily implementable framework that could lead to a controller design with little tuning effort. We have adopted an agent-oriented approach to encapsulate separate concerns in multiobjective and multivariable controller design.

The organization of this paper is as follows. Section 2 focuses on emotional learning and how it can be applied in the control scheme. Section 3 gives a brief review of agent concepts and how they can be used in control applications. The structure of the proposed controller and its adaptation law are developed in Section 4. Simulation results are provided in Section 5 to clarify the matter further, and the final conclusions are given in Section 6.

2 Emotional learning

According to psychological theories, some of the main factors in human learning are emotional elements such as satisfaction and stress. Emotions can be defined as states elicited by instrumental reinforcing stimuli which, if their occurrence, termination or omission is made contingent upon the making of a response, alter the course of future emission of that response [11].

Emotions can be accounted for as a result of the operation of a number of factors, including the following [11]:

1. The reinforcement contingency (e.g. whether reward or punishment is given, or withheld).
2. The intensity of reinforcement.
3. Any environmental stimulus might have a number of different reinforcement associations.
4. Emotions elicited by stimuli associated with different reinforcers will be different.

It should also be mentioned that in this paper emotion merely refers to the stress cue; other (and perhaps higher) emotions are not considered here.

Our proposed approach is, in a way, a cognitive restatement of reinforcement learning in a more complex continual case (where the reinforcement is also no longer a binary signal). The control system contains an element called the emotional critic, whose task is to assess the present situation, which has resulted from the applied control action, in terms of satisfactory achievement of the control goals, and to provide the so-called emotional signal (the stress). The controller should modify its characteristics so that the critic's stress is decreased. This is the primary goal of the proposed control scheme, and it is similar to the learning process in the real world, where we also search for a way to lower our stress with respect to our environment ([12-13]).

As seen, emotional learning is very close to reinforcement learning. The main difference between them is that in the former the reinforcement signal is an analog emotional cue that represents the cognitive assessment of future costs given the present state. So here the system does not wait for a total failure to occur before it starts learning. Instead, it continues its learning process at the same time as it applies its control action. The resulting analog reinforcement signal constitutes the stress cue, which has been interpreted as a cognitive/emotional state.

In the next section, we discuss the concept of agent-based systems, which will be used as the framework of our proposed control system.

3 Agent Concept and Multi-Agent Systems

The main problem in dealing with multivariable control systems is handling the cross-coupled components between different inputs and outputs. In other words, changing an input not only changes the corresponding output but also influences the other outputs. As will be discussed in Section 4, emotional learning provides a simple and useful tool for dealing with such unwanted effects. The concept of this method can be easily developed within the framework of multi-agent systems. To that end, in this section we briefly address agents and multi-agent systems.

Here we define an agent as a component of software and/or hardware that is capable of accomplishing tasks on behalf of its user. Following Jennings and Wooldridge's work [14], we define an agent to be any kind of object or process that exhibits autonomy, is either reactive or deliberative, has social ability, and can reason, plan, learn, and/or adapt its behavior in response to new situations.

Multi-agent systems (MASs) are systems in which there is no central control: the agents receive their inputs from the system (and possibly from other agents as well) and use these inputs to apply the appropriate actions. The global behavior of a MAS depends on the local behavior of each agent and the interactions between them [15]. The most important reason to use a MAS when designing a system is that some domains require it. Other benefits include parallelism, robustness, scalability and simple design.

Based on these concepts, we have proposed an emotion-based approach for the control of dynamic systems, which is discussed in the next section.

4. An Emotion-based Approach to the Control of Dynamic Systems using Agent Concept

In this section we design an intelligent controller based on the concepts considered in the previous sections. Fig. 1 shows the proposed agent's components and their relation to each other, based on the idea presented in [16]. As can be seen, the agent is


composed of four components. It perceives the states of the system through its sensors and also receives some information provided by other agents, then influences the system by providing a control signal through its actuator. The critics assess the behavior of the control system (i.e. criticize it) and provide the emotional signals for the controller. According to these emotional signals, the controller produces the control signal with the help of the learning element, which is adaptive emotional learning. The inputs of this learning element are the emotional signals provided by the agent's own critics and by other critics as well.

Fig 1. Structure of an agent in the proposed methodology (block labels: Sensors, Emotional Critics, Emotional Learning, Neurofuzzy Controller, Actuator; signals: output signals from the plant, control signal, signals to and from other agents)

Fig 2. Multi-agent based approach to multivariable control (Agent1 and Agent2 apply U1 and U2 to the multivariable system with outputs O1 and O2)

The number of agents is determined by the number of inputs of the system. The number of outputs of the system determines the number and structure of the system's critics, whose role is to assess the status of the outputs. (See Fig. 2 for the schematic of the presented approach when applied to a two-input two-output control system, where U1 and U2 denote the control signals and O1 and O2 are the outputs of the system.)

We now develop the controller structure for multivariable systems in general. From these calculations, the derivation of the special case of SISO systems is straightforward.

In the general case of multivariable systems, each agent consists of a neurofuzzy controller. All of the neurofuzzy controllers have identical structures; each one has four layers. The first layer assigns the input scaling factors in order to map the inputs to the range [-1, +1] (the inputs are chosen as the error and the change of the error in the response of the corresponding output). In the second layer, fuzzification is performed, assigning five labels to each input. For decision making, the max-product law is used in layer 3. Finally, in the last layer, the crisp output is calculated using the Takagi-Sugeno formula [17],

$$ y_i = \frac{\sum_{l=1}^{p} u_{il}\,(a_{il} x_{i1} + b_{il} x_{i2} + c_{il})}{\sum_{l=1}^{p} u_{il}} \qquad (i = 1, 2, \ldots, n) \qquad (1) $$

where $x_{i1}$ and $x_{i2}$ are the inputs to the controller (the error and the change of error of the corresponding output); $i$, $n$, $u_{il}$, $p$ and $y_i$ are the index of the controller, the number of controllers, the l-th input of the last layer, the number of rules in the third layer and the output of the controller, respectively; and the $a_{il}$, $b_{il}$ and $c_{il}$ are parameters to be determined via learning.

For each output, a critic is assigned whose task is to assess the control situation of that output and to provide the appropriate emotional signal. The role of the critics is crucial here, because eliminating the unwanted cross-coupled effects of multivariable control systems depends very much on the correct operation of these critics. Here, all the critics have the same structure as a PD fuzzy controller, with five labels for each input and seven labels for the output. The inputs of each critic are the error of the corresponding output and its derivative, and the output is the corresponding emotional signal. Deduction is performed by the max-product law, and the centroid law is used for defuzzification.

The emotional signals provided by these critics contribute collaboratively to updating the output-layer learning parameters of each controller; thus the cross-coupled nature of multivariable systems is handled in the critic and not in the controller itself. The aim of the control system is to minimize the sum of squared emotional signals. Accordingly, we first define the error function E as follows,

$$ E = \frac{1}{2} \sum_{j=1}^{m} K_j\, r_j^{2} \qquad (2) $$
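As an illustration of the forward pass in (1), the following Python sketch evaluates one controller under assumptions that the text does not specify: triangular membership functions with evenly spaced centres on [-1, +1], and rule firing strengths u_l formed as the product of the two input memberships. The names `CENTERS`, `WIDTH`, `clip`, `memberships`, `firing_strengths` and `ts_output` are hypothetical and chosen only for this example.

```python
from itertools import product

CENTERS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # ASSUMED centres of the five labels
WIDTH = 0.5                            # ASSUMED half-width of each triangle


def clip(value, lo=-1.0, hi=1.0):
    """Keep a scaled input inside the range [-1, +1] used by layer 1."""
    return max(lo, min(hi, value))


def memberships(x):
    """Layer 2: degrees of membership of x in the five triangular labels."""
    return [max(0.0, 1.0 - abs(x - c) / WIDTH) for c in CENTERS]


def firing_strengths(x1, x2):
    """Layer 3: one firing strength u_l per rule (5 x 5 = 25 rules).

    ASSUMPTION: the rule strength is the product of the two memberships.
    """
    return [m1 * m2 for m1, m2 in product(memberships(x1), memberships(x2))]


def ts_output(error, d_error, gains, params):
    """Layer 4: crisp output of equation (1).

    gains  -- (scale for the error, scale for the change of error)
    params -- list of 25 triples (a_l, b_l, c_l), one per rule
    """
    x1 = clip(gains[0] * error)
    x2 = clip(gains[1] * d_error)
    u = firing_strengths(x1, x2)
    numerator = sum(u_l * (a * x1 + b * x2 + c)
                    for u_l, (a, b, c) in zip(u, params))
    return numerator / sum(u)
```

With every consequent set to (a, b, c) = (0, 0, c0), the output reduces to c0 for any input, which is a quick sanity check on the normalisation in (1).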


In (2), $r_j$ is the emotional signal produced as the output of the j-th critic, $K_j$ is the corresponding output weight and $m$ is the total number of outputs (for the special case of SISO systems, K = 1 and m = 1).

Fig.3. The control loop in the case of SISO systems (labels: y_ref, Controller, Plant, y, Critic, Diff., Co_p, Co_d, Cr_p, Cr_d)

For the adjustment of the controllers' weights, the steepest descent method is used,

$$ \Delta\omega_i = -\eta_i \frac{\partial E}{\partial \omega_i} \qquad (i = 1, 2, \ldots, n) \qquad (3) $$

where $\eta_i$ is the learning rate of the corresponding neurofuzzy controller and $n$ is the total number of controllers.

In order to calculate the right-hand side of (3), the chain rule is used,

$$ \frac{\partial E}{\partial \omega_i} = \sum_{j=1}^{m} \frac{\partial E}{\partial r_j}\,\frac{\partial r_j}{\partial y_j}\,\frac{\partial y_j}{\partial u_i}\,\frac{\partial u_i}{\partial \omega_i} \qquad (i = 1, \ldots, n \text{ and } j = 1, 2, \ldots, m) \qquad (4) $$

Here $y_j$ is the j-th output of the plant and $u_i$ is the control signal produced by the i-th controller. From (2), we have,

$$ \frac{\partial E}{\partial r_j} = K_j r_j \qquad (j = 1, 2, \ldots, m) \qquad (5) $$

Also,

$$ \frac{\partial y_j}{\partial u_i} = J_{ji} \qquad (i = 1, 2, \ldots, n \text{ and } j = 1, 2, \ldots, m) \qquad (6) $$

where $J_{ji}$ is the element located at the i-th column and j-th row of the Jacobian matrix. Taking

$$ e_j = y_{ref,j} - y_j \qquad (j = 1, 2, \ldots, m) \qquad (7) $$

where $e_j$ is the error produced in tracking the j-th output and $y_{ref,j}$ is the reference input (in case the number of outputs is greater than the number of inputs, some of the $y_{ref,j}$ are taken as zero, as will be made clear in the next section by the inverted pendulum example), we have,

$$ \frac{\partial r_j}{\partial y_j} = -\frac{\partial r_j}{\partial e_j} \qquad (8) $$

Since r (the stress of the critic) increases as the error increases and, on the other hand, the on-line calculation of $\partial r_j / \partial e_j$ is accompanied by measurement errors and thus produces unreliable results, only its sign (+1) is used in our calculations. From (2) to (8), $\Delta\omega_i$ is calculated as follows,

$$ \Delta\omega_i = \eta_i \sum_{j=1}^{m} K_j\, r_j\, J_{ji}\, \frac{\partial u_i}{\partial \omega_i} \qquad (i = 1, 2, \ldots, n \text{ and } j = 1, 2, \ldots, m) \qquad (9) $$

Equation (9) is used for updating the learning parameters $a_{il}$, $b_{il}$ and $c_{il}$ in (1), which is straightforward.

In the next section, we apply the proposed method to several SISO and NSISO plants with different properties in order to see the performance of the proposed control methodology in practice.

5. Simulation Results

In this section, the proposed method is applied to control three dynamical systems. The first one is the highly nonlinear SISO Van der Pol system, where a single-agent approach is used. In the second one, the controller is applied to a multivariable linear plant under different conditions, so that its robustness in the presence of parameter uncertainties is shown. This example is concerned with systems with an equal number of inputs and outputs. In the last example, we apply our controller to the famous inverted pendulum benchmark, which is a SIMO (single-input multi-output) nonlinear non-minimum phase system.
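Before turning to the examples, the adaptation law (9) can be sketched in Python for the consequent parameters of one Takagi-Sugeno controller. The sketch assumes that the rule firing strengths, the critic outputs r_j, the weights K_j and the Jacobian entries J_ji are supplied by the caller (the text does not specify how J_ji is obtained). It also relies on the observation that, for the output form (1), the partial derivatives of the controller output with respect to a_l, b_l and c_l are the normalised firing strength times x_1, x_2 and 1, respectively. The function names `stress_cost` and `emotional_update` are hypothetical.

```python
def stress_cost(stresses, weights):
    """Error function of equation (2): E = 1/2 * sum_j K_j * r_j**2."""
    return 0.5 * sum(k * r * r for k, r in zip(weights, stresses))


def emotional_update(params, strengths, x1, x2,
                     stresses, weights, jacobian_column, eta):
    """Apply one step of equation (9) to the parameters of one controller.

    params          -- list of (a_l, b_l, c_l), one triple per rule
    strengths       -- rule firing strengths u_l for the current inputs
    x1, x2          -- scaled controller inputs (error, change of error)
    stresses        -- critic outputs r_j, one per plant output
    weights         -- output weights K_j
    jacobian_column -- J_ji for this controller i, one entry per output j
                       (ASSUMED to be known or estimated by the caller)
    eta             -- learning rate of this controller
    """
    # Common factor of (9): eta_i * sum_j K_j * r_j * J_ji
    drive = eta * sum(k * r * j
                      for k, r, j in zip(weights, stresses, jacobian_column))
    total = sum(strengths)
    updated = []
    for (a, b, c), u_l in zip(params, strengths):
        w = u_l / total  # normalised firing strength of rule l
        updated.append((a + drive * w * x1,   # du/da_l = w * x1
                        b + drive * w * x2,   # du/db_l = w * x2
                        c + drive * w))       # du/dc_l = w
    return updated
```

When every critic reports zero stress, `drive` is zero and the parameters are left unchanged, which matches the statement that the critics are satisfied once their targets are reached.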

Example 1: Van der Pol Equation

Our first example discusses the control of the Van der Pol system, which is considered a highly nonlinear SISO system. We use a single-agent single-critic approach here. The equations governing this system are as follows:

$$ \ddot{x} - (1 - x^{2})\,\dot{x} + x = u, \qquad y = x \qquad (10) $$

In (10), u is the input, x is the state variable and y is the output of the system. The block diagram of the control system is shown in Fig.3. The input scaling factors and the learning rate of the control system are chosen as follows:

$$ Co_p = 1, \quad Co_d = 2, \quad Cr_p = 1, \quad Cr_d = 2, \quad \eta = 20 \qquad (11) $$

Fig 4. Step response of the Example 1

The step response of the control system is shown in Fig.4. The result shows the power of the proposed algorithm in the control of this nonlinear SISO system.

Example 2: Control of a plant with different

In our second example, the problem of handling a multivariable plant with uncertainties is investigated. The plant has the following transfer function [18]:

$$ P(s) = \begin{bmatrix} \dfrac{k_{11}}{1 + sA_{11}} & \dfrac{k_{12}}{1 + sA_{12}} \\[2ex] \dfrac{k_{21}}{1 + sA_{21}} & \dfrac{k_{22}}{1 + sA_{22}} \end{bmatrix} \qquad (12) $$

It has a total of nine plant conditions, as given in Table 1. Our goal is to achieve the desired step response while output decomposition is maintained. A major problem encountered here is the tuning of the control system's coefficients in order to provide an acceptable step response. It is a time-consuming task, and there is the possibility that the desired step response may never be achieved.

Our experience with this control structure shows that when the change made in the input of the system is smooth (i.e. there are no sudden changes such as a step in the input, and smoother inputs such as sinusoids are applied instead), the control system acts very well. The reason is obvious: it takes much more time for the neural networks' weights to adjust when the input of the system changes suddenly (let us call it a harsh input) than when a much smoother input is applied. This problem is more evident in the case of multivariable systems, where the weights of more than one controller must be adjusted. Hence, when applying a harsh input to a system, we change it to a smooth one by pre-filtering it, obtaining a smooth (filtered) input for the system instead of the harsh (unfiltered) one. The pre-filters' specifications are determined by the properties of the desired step response. The results of the simulations here show that this approach, although simple, is very efficient in different control situations.

In this example, suppose that it is desired that both outputs have no overshoot and a rise time of not more than 1 second. Accordingly, based on a rough measure, the transfer functions of the pre-filters are the same and are chosen as follows (note that achieving more complicated inputs requires more complicated pre-filter design techniques, which are not the topic of our discussion here):

$$ H(s) = \frac{16}{s^{2} + 8s + 16} \qquad (13) $$

Results of simulations are shown in Fig.5 for a step at t=0 at the first input and another step at t=3 at the second input (since all the conditions produced nearly the same results, the results for three selected conditions are plotted). As is clear, the change of

plant conditions has little or almost no effect on the step responses of the system, i.e., the system shows great robustness in the presence of uncertainties. Comparing the results with those obtained by classical methods such as the one in [18] shows the superiority of the proposed algorithm.

Fig.5. Simulation results of example 2. 5(a) condition 1; 5(b) condition (4); 5(c) condition (9)

Although we have achieved good step responses and great robustness, another important aspect should be noted: the interaction in this system is not high. Interaction is the major drawback in the design of multivariable systems because it introduces unwanted effects from different inputs into the outputs of the system. The higher the interaction in a system, the more complex the control approach will be.

In order to show that our proposed controller can tolerate bigger parameter changes, which yield situations with high interaction, we added 6 more conditions to the previous ones (Table 2). The results of applying the controller are shown in Fig.6 for two selected conditions. As we can see, our method also shows great robustness to parameter uncertainties in the presence of high interactions.

Fig.6. Simulation results of example 2. 6(a) condition 10; 6(b) condition (12).

Example 3: An Inverted Pendulum:

The problem of balancing an inverted pendulum on a moving cart is a good example of a challenging multivariable situation, due to its highly nonlinear equations, non-minimum phase characteristics and the problem of handling two outputs with only one control input [19] (the position of the cart is sometimes ignored by researchers [20]). Here, the dynamics of the inverted pendulum are characterized by four variables: $\theta$ (angle of the pole with respect to the vertical axis), $\dot{\theta}$ (angular velocity of the pole), $z$ (position of the cart on the track), and $\dot{z}$ (velocity of the cart). The behavior of these state variables is governed by the following two second-order differential equations [17]:

$$ \ddot{\theta} = \frac{g \sin\theta + \cos\theta \left( \dfrac{-F - m l \dot{\theta}^{2} \sin\theta}{m_c + m} \right)}{l \left( \dfrac{4}{3} - \dfrac{m \cos^{2}\theta}{m_c + m} \right)} \qquad (14) $$

$$ \ddot{z} = \frac{F + m l \left( \dot{\theta}^{2} \sin\theta - \ddot{\theta} \cos\theta \right)}{m_c + m} \qquad (15) $$

where g (acceleration due to gravity) is 9.8 m/s², $m_c$ (mass of cart) is 1.0 kg, m is the mass of the pole, l (half-length of pole) is 0.5 m, and F is the applied force in newtons. Our control goal is to balance the pole, yet keep z not further than 2.5 meters from its original position. We use a single agent here, which provides the force F to the system and applies two emotional critics to assess the output. The first one criticizes the situation of the pole and the second one does the same for the cart's velocity. Both critics are satisfied when the inputs to them are zero (i.e. the pendulum is balanced and the cart has no velocity).

The results of simulation for the initial condition $\theta_0$ = 10 deg. are presented in Fig. 7. They show that after nearly six seconds the pole is balanced and the cart is stopped successfully around 1.4 meters from the original position.
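Taking (14) and (15) in the standard cart-pole form, the plant can be evaluated numerically as in the following Python sketch, which may be useful for reproducing its open-loop behaviour. The pole mass m is not stated in the text above, so `M_POLE = 0.1` kg is an assumed value; the forward-Euler step, the step size and all function names are likewise choices made for this illustration, not details from the paper.

```python
import math

G = 9.8        # acceleration due to gravity, m/s^2 (from the text)
M_CART = 1.0   # mass of cart, kg (from the text)
M_POLE = 0.1   # mass of pole, kg (ASSUMED; not given in the text)
L_HALF = 0.5   # half-length of pole, m (from the text)


def accelerations(theta, theta_dot, force):
    """Return (theta_ddot, z_ddot) from equations (14) and (15)."""
    total = M_CART + M_POLE
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    # Bracketed term in the numerator of (14)
    temp = (-force - M_POLE * L_HALF * theta_dot ** 2 * sin_t) / total
    theta_acc = (G * sin_t + cos_t * temp) / (
        L_HALF * (4.0 / 3.0 - M_POLE * cos_t ** 2 / total))
    z_acc = (force + M_POLE * L_HALF *
             (theta_dot ** 2 * sin_t - theta_acc * cos_t)) / total
    return theta_acc, z_acc


def euler_step(state, force, dt=0.01):
    """Advance (theta, theta_dot, z, z_dot) by one forward-Euler step."""
    theta, theta_dot, z, z_dot = state
    theta_acc, z_acc = accelerations(theta, theta_dot, force)
    return (theta + dt * theta_dot,
            theta_dot + dt * theta_acc,
            z + dt * z_dot,
            z_dot + dt * z_acc)


# Open-loop check: starting 10 degrees from vertical with no force applied,
# the pole accelerates away from the upright position.
state = (math.radians(10.0), 0.0, 0.0, 0.0)
state = euler_step(state, force=0.0)
```

In a closed-loop simulation, the force passed to `euler_step` would come from the agent's neurofuzzy controller at each step; that coupling is omitted here to keep the plant model separate.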


Fig. 7. Responses of variables of example 3 (from left to right: pole's angle and cart's position)

6. Discussions and Conclusions

In this section we discuss the general properties of the proposed framework and summarize the work that has been done in this paper.

6.1. The role of emotional signals in the proposed control scheme

The proposed methodology is based on continuous emotional (stress) signals, which can be considered as performance measures of the particular parts of the control system that are of interest. In this paper, the parts of interest are the outputs of the control system and the cross-coupled components in multivariable systems. In each part, the nearer we get to our predefined target, the smaller the corresponding emotional signal, and vice versa. With this simple approach, we can easily include in our framework any part of the plant over which we want to have control. For example, to exclude the effect of cross-coupled components in multivariable systems, we assign a critic to each component. This critic judges whether or not the control system has counterbalanced the cross-coupled effects. Based on the success of the controller in dealing with interaction, emotional signals are produced by the critics, which in turn tune the parameters of the neurofuzzy controller so that the stress of the critics is decreased.

The same situation also holds for the inverted pendulum example in the previous section. The main variable of interest is the position of the pendulum with respect to the vertical axis, but at the same time the position of the cart is also of interest as the secondary control variable. Hence the inputs of the neurofuzzy controller are the error and the derivative of the angle of the pole with respect to the vertical axis, but the weights of the controller are tuned based on the outputs of two critics: the first one criticizes the position of the pole and the second one does the same for the position of the cart. Both critics produce continuous signals until their inputs are zero, i.e. until the predefined targets are achieved.

Next we discuss how our framework is related to agents and multi-agent systems, and then we briefly discuss the advantages and the shortcomings of the current method, followed by a description of future work.

6.2 Relationship between the Proposed Framework and Agent-Based Systems

In this paper a major consideration has been distributing control concerns via agents. Each agent is used to represent a control concern. We have used the notion of agency only in a conceptual sense, and no effort has been made towards the utilization of agent-oriented technologies such as ACLs, platforms, wrappers, etc. However, those technologies can be of benefit in future, more complex applications. The main agent property in our paper is autonomy. Our agents can be interpreted as both deliberative and reactive (since emotion is a mental state, but is also very close to the concept of reinforcement), and learning, reasoning and adaptation are central to our proposed controller. Other benefits of agent orientation can also be seen to be applicable.

6.3 Conclusions and Future Works

In this paper, the emotional learning based intelligent control scheme was applied to dynamic plants, and the performance of the proposed algorithm was investigated on several benchmark examples. The main contribution of the proposed generalization is to provide an easy-to-implement emotional learning technique for dealing with dynamic (especially multivariable) control systems, where the use of other control methodologies (especially intelligent control methods) is sometimes problematic ([21]). Simplicity and tolerance for uncertainties and nonlinearities are what is gained by its use. This is shown for SISO and NSISO systems in this paper.

On the negative side, it should be pointed out that only a very simple learning algorithm has been used throughout this paper. Although this stresses the simplicity and generality of the proposed technique, more complex learning algorithms involving time credit assignments [22] and temporal difference [23] or similar methods might be called for when processes involve unknown delays. Also, as the number of inputs and outputs of the system grows, the tuning of the control parameters becomes a tiresome task. A continuous genetic algorithm based optimization method is under development to find the optimal selection of the tuning parameters of the overall control system automatically. Once this is done, generalization to systems with more than two inputs and outputs can be realized efficiently.

Other works include the application of multiple critics in a SISO plant in order to achieve multiple objectives ([6-7]). For example, in [6] two objectives (good tracking and low control costs) are considered simultaneously. This reference shows the difference between our approach and supervised learning in a


clearer manner, because it shows that our proposed methodology can perform well not only in the case of cheap control, but also where the control action involves costs. Implementing such a control system for a switched reluctance motor (as a practical system) is also under investigation. Again, agent orientation can underline the fact that each objective can be considered a separate concern delegated to an agent.

Our future work includes changing the structure of the controller so that it can be applied to processes with unknown delays, considering more emotions in our control structure, optimizing the structure of the controller (for example, using a genetic algorithm for the optimum selection of the membership functions of both the controllers and the critics), and finally considering more complex cues in our learning process.

Acknowledgement

The authors would like to thank the two anonymous referees for their valuable comments.

References

[1] A. H. Simon and Associates (1987), Decision making and problem solving, Interfaces, no.17.
[2] C. Balkenius and J. Moren (2000), A Computational Model of Context Processing, 6th International Conference on Simulation of Adaptive Behavior, Cambridge.
[3] M. El-Nasr, T. Loerger, and J. Yen (1999), Peteei: A Pet with Evolving Emotional Intelligence, Autonomous Agents99, pp.9-15.
[4] J. Velasquez (1998), A Computational Framework for Emotion-Based Control, Grounding Emotions in Adaptive Systems Workshop, SAB'98, Zurich, Switzerland.
[5] M. Fatourechi, C. Lucas, and A. Khaki Sedigh (2001), An Agent-based Approach to Multivariable Control, IASTED International Conference on Artificial Intelligence and Applications, Marbella, Spain, pp.376-
[6] M. Fatourechi, C. Lucas and A. Khaki Sedigh (2001), Reducing Control Effort by means of Emotional Learning, 9th Iranian Conference on Electrical Engineering (ICEE2001), Tehran, Iran, pp.41-1 to 41-8.
[7] M. Fatourechi, C. Lucas and A. Khaki Sedigh (2001), Reduction of Maximum Overshoot by means of Emotional Learning, 6th Annual CSI Computer Conference, Isfahan, Iran, pp.460-467.
[8] C. Lucas and S.A. Jazbi (1998), Intelligent motion control of electric motors with evaluative feedback, Cigre98, Cigre, France, 11-104, pp.1-6.
[9] C. Lucas, S.A. Jazbi, M. Fatourechi, M. Farshad (2000), Cognitive action selection with neurocontrollers, Third Iran-Armenia Workshop on Neural Networks, Yerevan, Armenia.
[10] P. Maes (ed.) (1991), Designing autonomous agents: theory and practice from biology to engineering and back, The MIT Press, London.
[11] E. D. Rolls (1998), The Brain and Emotion, Oxford University Press.
[12] K.M. Galloti (1999), Cognitive psychology in and out of laboratory (2nd ed.), Brooks/Cole, Pacific Grove, CA.
[13] R.W. Kentridge and J.P. Aggleton (1990), Emotion: Sensory representations, reinforcement and the temporal lobe, Cognition and Emotion 4, pp.191-208.
[14] M. Wooldridge and N. Jennings (1995), Intelligent Agents: Theory and Practice, The Knowledge Engineering Review, 10 (2), pp.115-152.
[15] M. Wooldridge (1999), Intelligent agents, in G. Weiss (Ed.), Multi agent Systems: A modern approach to Distributed Artificial Intelligence, MIT Press, London, pp.27-77.
[16] S. Russel and P. Norwig (1995), A modern approach to artificial intelligence, Prentice-Hall, Englewood Cliffs.
[17] T. Takagi and M. Sugeno (1983), Derivation of fuzzy control rules from human operator's control actions, IFAC Symp. on Fuzzy Information, Knowledge Representation and Decision Analysis, pp.55-60.
[18] C.C. Cheng, Y.K. Liao, T.S. Wanq (1997), Quantitative design of uncertain multivariable control system with an inner-feedback loop, IEE Proceedings on Control Theory Applications, no.144, pp.195-201.
[19] R. H. Cannon (1967), Dynamics of Physical Systems, McGraw-Hill, New York.
[20] J. S. Jang (1992), Self-learning fuzzy controllers based on temporal back propagation, IEEE Transactions on Neural Networks, 3(5), pp.714-723.
[21] P.G. Lee, K.K. Lee and G.J. Jeon (1995), An index of applicability for the decomposition method of multivariable fuzzy systems, IEEE Transactions on Fuzzy Systems, no.3, pp.364-369.
[22] R. S. Sutton, A. G. Barto (1987), A Temporal Difference Model of Classical Conditioning, 9th Annual Conference on Cognitive Science, New Jersey, pp.355-378.
[23] R. S. Sutton (1988), Learning to Predict by the Method of Temporal Differences, Machine Learning, no.3, pp.9-44.
