THIS IS UNDER REVIEW AND MAY NOT BE USED FOR WINTER 2010

Updates:

Now points to new version of HostServer
Noted that integrating the animal program might come later

Program Four Intelligent Distributed Agent

You are required to complete either the agent program OR the Agent Paper. In general, running programs will be considered more favorably than papers, because they are harder and more time-consuming. An outstanding paper is roughly equivalent to a good DIA program.

Submit the paper to the Final-Paper COL link. Submit the DIA program to the DIA COL link. Pick exactly ONE link for submission.

General overview:

You must provide a DIAgent-readme.html file, in the correct format. THIS IS IMPERATIVE.

If you want credit for your development work, which I will give you, you MUST include DIAgent-dev-log.html file in the correct format. This is a simple daily line-item file listing what you did, on what date, and how long you spent, working on this project.

Your code must have extensive internal documentation: header comments, comments for the methods, comments for the classes.

This assignment is designed to make some of the material in the book less academic. By designing, and to some extent implementing, a simple distributed system that yet embodies some hard problems, our goal is to synthesize concepts from the book in a down-to-earth way, and to write in a meaningful way about the distributed systems problems embodied in our own architecture.

In this assignment you will build a distributed application that implements a framework for an "intelligent" agent that runs on multiple nodes of a network. The agent learns through interaction with the user, and this learning is reflected, generally in real time, at remote locations.

In the basic running of this collection of agents, users sit at terminals and play the "animal" game; as they do so the agent learns about new animals. One agent interacts with one user; many agents can run at once. But agents can also interact with other agents, ship themselves to other machines in ways possibly transparent to users, and so on. Central to our implementation is the idea that agents can co-operate, possibly in groups, to share knowledge about what they have learned from users.

Refer to (and run) the simple Web Animal Java program I wrote to see how the basic animal game is played, and to get some suggestions on how you might implement your game. There are many sophisticated implementations of the game, in many programming languages, and you might want to try some out for fun and for some ideas. E.g., one such is: Animal.java

Administration:

Files submitted to COL:
- checklist-agent.html
- DIAgent.java - The source file for your intelligent agent WITH COMMENTS!
- DIAgent-readme.html - strict format, everything we need to know
- DIAgent-dev-log.html - get credit for your work
- Other files as needed, be sure to document in DIAgent-readme.html
Copy the checklist for this programming assignment. Update it as you make progress.
Refer to your JokeServer program and also your MyWebserver program for large hints on how to get this program working, and see below for a simple web interface.
READ and POST at the agent newsgroup. I will consider your timely participation at the DIA newsgroup when assessing your grade for this assignment. Newsgroups have useful academic discussions, and also interesting tips. Additionally they get you in the habit of submitting your ideas for review / discussion to a community of your peers.

Basic DIA assignment.

In the simplest form your DIA assignment will...

use the HostServer base code (provided as PDF, see below).
listen for connections at a master port of 2525 and return further instructions to a web client which connects to that port.
Write code to support multiple simple agents participating together with the hostserver. Start with a simple application that simply collects a name, and a number from the user. Later this will be modified to playing the animal game over the web.
Allow agents to migrate to different ports under the same hostserver, but retain the state of users interacting with that agent.
Require the agents to belong to one of at least two groups, which are assigned when an agent is instantiated. Use city names for group names, such as Baltimore Group or Bombay Group.
Require that agents post their group membership on all the web pages they return, along with a color that is currently active for all members of that group.
Implement a set of HTML radio buttons for all agents so that "colors" can be set for that agent; the color is also changed for all members of the agent's group. (That is, if agents A, B, and C are part of group Baltimore, and X, Y and Z are members of group Bombay, then if at time T1 A sets the color of the group to green, any server requests by A, B, or C at time T2 or later will show that the color of the group is green until it is further, later, set to something else. X, Y, and Z, by contrast, will not be affected by this change.)
Add two threads so that your agent has an agent communication channel (same as the admin channel in the joke server), and a Sleep Looper. In this way your agent can be active and "live" even when it is otherwise blocked waiting for input from the client (see below). Cause each agent to autonomously wake up periodically and send a message, such as "Agent XXX just woke up to say hello," to all agents in the group, which is then displayed on the console of each agent in the group.
Integrate the animal code by modifying the existing code, or writing your own code from scratch, so that an agent accepts values from, and returns responses to, a Web Browser, using simple HTML forms. In most cases you will have just one client (user) per agent (but see below). [Some of you may end up skipping this step. Note that you can implement the rest of the basic agent system without actually playing the animal game. DO THIS LAST.]
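
The group-color rule in the list above (a color set by any member is seen by every member of that group, and by no other group) can be sketched as a small shared registry. This is only a minimal sketch, assuming all agents live under one HostServer in one JVM. The class name GroupColors, its methods, the default color "white", and the three radio-button colors are all hypothetical and are not part of the HostServer code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: one shared color per group, visible to every agent in that group.
class GroupColors {
    private final Map<String, String> colorByGroup = new ConcurrentHashMap<>();
    private final Map<String, String> groupByAgent = new ConcurrentHashMap<>();

    // Called when an agent is instantiated; the group is assigned at that time.
    void join(String agentName, String groupName) {
        groupByAgent.put(agentName, groupName);
        colorByGroup.putIfAbsent(groupName, "white"); // assumed default color
    }

    // Any member may set the color; it changes for the whole group.
    void setColor(String agentName, String color) {
        colorByGroup.put(groupOf(agentName), color);
    }

    String colorFor(String agentName) {
        return colorByGroup.get(groupOf(agentName));
    }

    String groupOf(String agentName) {
        String group = groupByAgent.get(agentName);
        if (group == null) throw new IllegalArgumentException("unknown agent: " + agentName);
        return group;
    }

    // HTML fragment an agent could append to every page it returns.
    String htmlFor(String agentName) {
        String current = colorFor(agentName);
        StringBuilder sb = new StringBuilder();
        sb.append("<p>Group: ").append(groupOf(agentName))
          .append(", color: ").append(current).append("</p>\n");
        for (String c : new String[] {"red", "green", "blue"}) {
            sb.append("<input type=\"radio\" name=\"color\" value=\"").append(c).append("\"")
              .append(c.equals(current) ? " checked" : "").append("> ")
              .append(c).append("\n");
        }
        return sb.toString();
    }
}
```

Once agents live in different processes or on different machines, this in-memory map no longer works, and the same rule has to be carried by messages on the agent communication channel.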

Here are the HostServer class files giving the example framework we ran in class, and also a printout of the HostServer Java Code. To run, just download the .class files into a directory and issue "java HostServer" then start your browser. Additionally here is the text version of the Execution Instructions for the HostServer. The client is simply a web browser pointed to localhost:1565 to start. (Yes, the DIA port is 2525.) Here is code to get the next available port.
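
The linked port-finding code is not reproduced on this page. As a stand-in, here is one common way to find the next available port: probe upward from a starting port by trying to bind a ServerSocket. This is a sketch under stated assumptions; the class name PortFinder and the method nextAvailablePort are hypothetical, and the linked course code may work differently.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortFinder {
    // Returns the first port in [start, 65535] that can be bound right now.
    // Note the race: another process may take the port between this check
    // and the caller's own bind, so the caller should still handle a bind failure.
    static int nextAvailablePort(int start) {
        for (int port = start; port <= 65535; port++) {
            try (ServerSocket probe = new ServerSocket(port)) {
                return port; // bind succeeded; the probe socket is closed on return
            } catch (IOException busy) {
                // Port in use (or not permitted); try the next one.
            }
        }
        throw new IllegalStateException("no free port at or above " + start);
    }
}
```

For example, `PortFinder.nextAvailablePort(3000)` returns 3000 if that port is free, otherwise the next free port above it. The starting value 3000 is just an illustration.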

In all cases, your server should run as a standalone java application (unless prior arrangement is made), accept initial client interactions from a Web browser at port 2525, and provide Web-based instructions on how to run from that point on. That is, I would like to be able to compile your java source code, run it in a virtual machine, connect several web browser sessions to your server at http://localhost:2525 and get instructions on how to proceed from there.

Discussion:

The basic problem:

We are designing and (partially?) building a Distributed Intelligent Agent system. Thus the following features must be present:

Distribution in the form of processes running on different machines (albeit simulation of this on one machine is fine for our purposes).
Intelligence in the form of learning. Stored on disk. Generally monotonic (that is, it generally increases). Here we learn about animals and questions that discriminate one animal from another, but the content of the game is not important (although the structure of the data, and user interaction IS).
Agents in the sense that this code (a) runs with a unique identifier bound to a state that persists over time, across invocations. (b) can make some more or less autonomous decisions distinct from user input, (c) can communicate with other autonomous agents in ways that affect its decision process, and (d) the agent lives in at most one place at one time (unless otherwise explicitly explained as very special cases). In this case our agents also interact with a user.

This assignment is intended to be fun, but also to exercise our developing expertise with respect to some very real, and very hard, problems in building distributed applications. How do we serialize a dynamic data structure? How do we deal with time in a system with independent clocks? How do we deal with node failure, and in particular with failure of a node that maintains the canonical form of an agent? What decisions must we make regarding the tradeoffs between locking data in critical sections of a social process -- causing possibly unacceptable processing delay -- and losing data because of the multiple-update problem? Do we want to use a distributed transaction model, or something less rigorous? What use can we make of soft data where some data loss is acceptable? How do we migrate an agent from one place to another when there are multiple clients looking for the agent in its current location?

The assignment is not intended to be a burden to students who have constraints on their time, or feel they do not have as strong a programming background as others. If you fall in one of those categories, then design and implement only the base system using the HostServer example, and write a straightforward paper about the rest. But, block out your goals early so that you do not have to worry about it.

Protocols for all communication (such as between agents, between agents and host servers, and between one host server and another) in your system are up to you. The platform you use is up to you. The system design is up to you. The particular set of problems you solve, beyond the basic system, is up to you.

Use the standard checklist that we use for all assignments. Note that you will have to fill out many of the features of your system yourself by adding table entries at the bottom.

Because agents MUST communicate with other agents (in the simple case, by sending messages regarding the current color of the group; in the more complex case by autonomously migrating without instigation by the user; in the best case, by sharing animal data updates, and node locks, etc., in real time) they MUST not simply wait in a blocked state waiting for input from the client. But, fortunately, we have a model for this in the admin channel of the joke server. Additionally, we want to have some kind of continual processing outside of waiting for input from either the client or an admin request. This can be simulated by a simple sleep-and-wake-up loop, where the agent does something every time it wakes up (see discussion above, and below).

Note that, if it is necessary, there are a number of ways to store state on the client in a web browser, the easiest of which is to simply store values, possibly hidden values, in a dynamically constructed web page. As you will see below, this may not be necessary, because, unlike with the JokeServer, it is possible that the entire state of the agent, and the game, can be stored at the server (that is, as part of the agent) -- see below.

Getting started:

You are free to write your program from the ground up, using mechanisms of your choosing. The following is just one way to get started, if you do not have a better plan.

Get the HostServer program (above) running, by typing it in, manually. This will serve as the host framework for your DIAgents.
Get the animal program running.
Think about integration of the animal program and the HostServer. Make design decisions, on paper, about where you will be storing the state of the animal game for your clients, how the data structure works (you can always create your own data structure -- it is just a tree), how the web interaction will work, etc.
[NOTE: you might want to do this last, and just fake the animal application so you don't get bogged down at this point, then add it later.] Integrate the animal program into the HostServer such that input and output to the program are from the web to an agent. This may be the most challenging step for some of you, so you should start early. It is likely to be true that for every minute you spend on the previous step, you will save ten minutes on this step. (Note: if you cannot implement the full assignment, but want to turn something in, you might skip the animal game altogether and just return a state number to your web clients, as the host server does, and then focus on the other DS aspects of the system.) You might consider writing your own animal code, based on your own implementation of the data tree, rather than using that from the example.
Modify your code so that agents are not simply blocked waiting for input from a client. That is, start an additional thread that uses a sleep loop (Loop forever; do something like print an incremented number on the console; sleep for five seconds; repeat) to keep the agent "alive" even while it is waiting for client input on a doorbell socket. Add yet one more thread that listens for agent communication connections (e.g., the "administration" channel of the JokeServer).
Add a structure that keeps track of the agent's group and color. Change the HTML that is sent back to clients to reflect the group and color of the agent, and also displays a set of radio buttons allowing users to set the color for the group. Make the first version work so that the user changes just the color for the one agent.
Implement a way for agents to have unique names.
Write the syntax for the agent communication language that you will use, on paper.
Write out the design of how an agent will be inserted into a particular group when it is initiated, and how it knows the addresses of other agents in its group.
Modify the sleep loop so that when an agent wakes up, it sends a hello message to the agent communication channel of all other agents in the group, using the agent communication language you have designed.
Modify the agent communication channel of the agent so that when it receives a "hello" message from another agent, it displays that message on its server console.
Modify the code of the agent so that when a user tells it to change the color of the group, it sends a message to all other agents in the group, using the agent communication language you have designed, telling them to change their color as well.
Modify the code of the agent so that when it receives a message on its agent communication channel, telling it to change its color, it does so.
Modify the HostServer / Agent code so that when agents migrate from one location to another, they include their entire state (location in game of their client, color, group membership, and so forth). It is O.K. for you to instigate this migration from a client, but better if you instigate the migration, autonomously, from within the agent "wake-up" loop.
You now have a basic agent, hosted on a server that hosts agents, is uniquely identified, is "intelligent" in the sense that it learns, migrates from one location to another, and communicates autonomously with other agents.
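
The sleep-and-wake-up steps above can be sketched as a small Runnable. This is a minimal sketch, not the required design: the class name SleepLooper is hypothetical, and the `broadcast` callback merely stands in for "send this message to the agent communication channel of every agent in my group" in whatever agent communication language you design.

```java
import java.util.function.Consumer;

// Hypothetical sleep looper: wakes up periodically and emits a hello message.
class SleepLooper implements Runnable {
    private final String agentName;
    private final long sleepMillis;
    private final Consumer<String> broadcast; // stand-in for sending to all group members
    private volatile boolean running = true;
    private int wakeCount = 0;

    SleepLooper(String agentName, long sleepMillis, Consumer<String> broadcast) {
        this.agentName = agentName;
        this.sleepMillis = sleepMillis;
        this.broadcast = broadcast;
    }

    // One wake-up. Kept separate from run() so it can be exercised without sleeping.
    String wakeUp() {
        wakeCount++;
        String msg = "Agent " + agentName + " just woke up to say hello (" + wakeCount + ")";
        broadcast.accept(msg);
        return msg;
    }

    @Override
    public void run() {
        while (running) {
            wakeUp();
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // stop looping when interrupted
            }
        }
    }

    void stop() {
        running = false;
    }
}
```

A possible use is `new Thread(new SleepLooper("A1", 5000, System.out::println)).start();`, which prints a hello line on the console every five seconds while the agent's main loop stays blocked waiting for its client.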

More Complex DI Agents

Note: None of the following are required. Ordinarily you would pick one or two that are of interest, and implement them, then pick a collection of ideas to write about in your paper. However, if you like implementation, then you can write more code, and less paper, and if you do not like programming, or do not have the time for it, then you can stick with the basic implementation and spend significantly more time with the paper.

In general, implementation carries quite a bit of weight because it is hard, and because it demonstrates deep understanding of the ideas. But, just getting the basic system complete already represents a significant effort, so do not feel compelled to write more code if it is not of interest to you.

GET CREDIT: If you attempt some of the following, but do not get all of your attempted extensions working, note it in your development log, with details, and I will likely give some credit as long as I think you learned something along the way. The point is, I want to give you credit where I can, and you have to help me do this by documenting your work.

The ideal general goal is to use as much of the knowledge recorded by users as possible (that is, the input from one user is used by another user). The sharing of this particular data is complicated. The pedagogical goal is to play with DS concepts in a real, hands-on, way. So, pick the parts that you wish to implement that match your own pedagogical interests. You do not have to implement your entire system, and would, then, ordinarily put the rest of your design in your agents paper.

SAVE DATA: As animals are learned, this knowledge is saved to disk (either for an agent, or for a group, or for the entire system -- if data is shared) so that the data is remembered across invocations of the agent framework. Note: this requires that you marshal the data into serial format for writing to disk, and unmarshal it back into dynamic data structure format once it is read back in.
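
One possible way to marshal the tree is a preorder walk that writes one node per line, which can then be read back recursively. This is a sketch under stated assumptions: the class names AnimalNode and TreeCodec and the "Q:" / "A:" line prefixes are invented for illustration, and the format assumes that no question or animal name contains a newline.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical tree node: internal nodes are questions, leaves are animals.
class AnimalNode {
    final String text;     // the question, or the animal name
    final AnimalNode yes;  // left subtree: the question is true
    final AnimalNode no;   // right subtree: the question is false

    AnimalNode(String animal) { this(animal, null, null); }

    AnimalNode(String question, AnimalNode yes, AnimalNode no) {
        this.text = question;
        this.yes = yes;
        this.no = no;
    }

    boolean isLeaf() { return yes == null; }
}

class TreeCodec {
    // Preorder walk: node first, then the yes subtree, then the no subtree.
    static String marshal(AnimalNode root) {
        StringBuilder sb = new StringBuilder();
        write(root, sb);
        return sb.toString();
    }

    private static void write(AnimalNode n, StringBuilder sb) {
        if (n.isLeaf()) {
            sb.append("A:").append(n.text).append('\n');
            return;
        }
        sb.append("Q:").append(n.text).append('\n');
        write(n.yes, sb);
        write(n.no, sb);
    }

    // Rebuilds the dynamic structure by consuming lines in the same preorder.
    static AnimalNode unmarshal(String data) {
        Deque<String> lines = new ArrayDeque<>(Arrays.asList(data.split("\n")));
        return read(lines);
    }

    private static AnimalNode read(Deque<String> lines) {
        String line = lines.removeFirst();
        String text = line.substring(2);
        if (line.startsWith("A:")) return new AnimalNode(text);
        AnimalNode yes = read(lines);
        AnimalNode no = read(lines);
        return new AnimalNode(text, yes, no);
    }
}
```

The marshalled string can be written to and read from disk with the standard file APIs (for example java.nio.file.Files), or sent over a socket when an agent migrates.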

XML: XML is becoming strategic, and is used as the external marshalled form of data for infrastructures like Web Services. You might wish to consider XML-compliant communication languages for your agents, host-servers, etc.

PERSISTENCE: Modify your agents so that their state is saved to disk and can be restored, by name, across invocations, according to your agent's identifier. That is, where their user is in the game, and what their relationships are to other agents regarding groups, trust, etc.

SUPER SERVER: Using a super server, start agents when they are needed by a client, and then kill them off when they are not needed. You can pair this with the above extension so that if clients are uniquely identified by state stored in a cookie, then if they quit the game, but later start up again, they will have their particular agent restored to them.

RICHER ACL: design a more robust agent communication language for communicating between agents. Give the full exposition of synchronization and acknowledgements that must take place during handshaking, allowable content, recovery from errors. As part of your ACL, you might wish to have a subset for communication between other entities, such as, e.g., that between an agent that wishes to migrate, and a potential hosting server.

MULTIPLE CLIENTS PER AGENT: If you use the HostServer framework as a basis for your assignment, then each agent loop can spawn multiple workers to handle requests from web clients (users). In this way, you can then support multiple clients per agent. Unless this is of specific interest to you, I do not recommend more than ONE client per agent. Here is why this gets very complicated...

If you allow only one client per agent you do not have to maintain the state of the conversation on the client. By connecting to their "personal" agents for the next step in the game, the clients are de facto identifying themselves. There are no other clients. In this way you can keep the entire state of not only the animal data, but also the position of the client in the tree (that is, where they are in the game), on the server.
If, by contrast, you allow multiple clients per agent, you will have to store the state (i.e., where they are in the animal game) on the client, (and yes, it is true that you know how to do this from the Jokeserver assignment and the HostServer code).
In addition, and most importantly, when an agent migrates, it must temporarily live partly in each of two (or more!) locations at once: all of the clients will be communicating with the original port (and address), until they find out about the new location. An agent migrates only when it informs its client where it is migrating to, and it will do this by sending the new port back as a response to the client. It is at that point that the new port becomes active, and the old port is killed off. But, if there is more than one client for the agent, then it must, temporarily, listen for clients on both the new port (for clients that have submitted requests) and the old port (for clients that have not yet submitted a request). Now, consider what happens if a client never gets around to submitting another request, wherein we are left with an unreferenced agent object; and what happens if the agent migrates again to a third (or fourth, or fifth...) location before all clients have made their requests to the original, first, port, wherein we have to support a theoretically infinite number of simultaneous agent locations.
Many solutions exist (but note that none of them can be failure driven because failure from a simple web client means that the browser will just hang (the endless hourglass, or the timeout page-not-found message) --- very ugly indeed!). One solution is to use an agent location service for ALL communications with agents. When an agent migrates, it registers its new location with the service, and all subsequent requests are forwarded there. Another solution would use some version of reference counting, and when all references to the old location of the server are handled (by pointing to the new location in the response to the client) then the old agent-object can be removed.
When an agent is distributed across multiple ports, even temporarily, then the model of data access for the state of that agent is much more complicated. Especially noteworthy is that multiple processes, not just threads, may now be sharing the state of the agent, because the agent is now being hosted at two different ports, under the umbrella of possibly two different HostServer frameworks. But wait, it gets worse: in the most difficult case, the agent will now be living on two, or more, different machines yet all sharing the one, single, agent state.
For the above reasons, a robust, true, solution to just the one problem of allowing multiple clients for one agent, would comprise a challenging DIAgents assignment.
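
The agent location service mentioned above can be sketched as a registry from agent name to current port. This is only an illustration under assumptions: the class AgentLocationService and its methods are hypothetical, it runs in a single JVM, and answering the browser with an HTTP redirect is just one possible way of "forwarding" a request to the agent's new location.

```java
import java.util.Map;
import java.util.OptionalInt;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: where does each agent currently live?
class AgentLocationService {
    private final Map<String, Integer> portByAgent = new ConcurrentHashMap<>();

    // Called by an agent when it starts, and again each time it migrates.
    void register(String agentName, int port) {
        portByAgent.put(agentName, port);
    }

    // Called when an agent is shut down for good.
    void unregister(String agentName) {
        portByAgent.remove(agentName);
    }

    OptionalInt lookup(String agentName) {
        Integer port = portByAgent.get(agentName);
        return port == null ? OptionalInt.empty() : OptionalInt.of(port);
    }

    // Response a front end could send so that a browser follows the agent to its new port.
    String redirectFor(String agentName, String host) {
        OptionalInt port = lookup(agentName);
        if (!port.isPresent()) {
            return "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n";
        }
        return "HTTP/1.1 302 Found\r\n"
             + "Location: http://" + host + ":" + port.getAsInt() + "/\r\n"
             + "Content-Length: 0\r\n\r\n";
    }
}
```

Because every client request goes through the lookup, the old port can be closed as soon as the agent has registered its new one. The price is that the service itself becomes a single point of failure.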

SECURE CLIENT DATA: Modify your agent/client interaction so that the animal data is treated as private, sensitive, data. How can you protect your client's animal data, especially when an agent migrates from one location to another?

SECURE AGENT COMMUNICATION: How can you implement secure communication between agents (e.g., within one group)? If you are using Public Key Encryption, how is the secret key of an agent migrated along with the agent without compromising security? What sort of trust model is required?

Distributing the data!

An excellent set of modifications to the original simple design allows us to share data between the agents, and to use the data in novel ways. This is, in theory, the ideal goal of the agent system. This section suggests some ways that the data can be used (but there are many others, as well). Note that many of these extensions must address what we have called the Lunch Problem, below.

SHARE DATA: Whenever one user enters an animal into the tree, this update is reflected in the knowledge of all other agents, in real time.

LUNCH PROBLEM SOLUTIONS: See below. Implement one or more solutions, or semi-solutions to the lunch problem, when data is shared. (See also the lecture notes.)

NODE LOCKING: Change the data structure to have UUIDs in the nodes. This will allow you to lock portions of the data when synchronizing updates between agents.

CENTRALIZED DATA STORE: Have all agents share one instance of the data. Use a coordinator process to restrict access to the data, treating update requests as a distributed critical section problem.

MULTIPLE TREES: When subtrees cannot be integrated, then save them, and use them as alternative paths after a failure. That is, having guessed "goose" and been wrong, then go back up the tree to a point where there is another subtree available, and start asking those questions. Also, the alternative subtrees can be used with the "alternative" uses of the data given below.

LOSSY DATA: The shared state of the agents is reconciled only periodically, and if a leaf (animal) node has been modified in more than one way, some algorithm is used to select one version of the data, discarding the other. Although not specific to the idea of allowing less than completely consistent data, a highly relevant set of discussions can be found in Tanenbaum section 6.2.2 through 6.2.8, e.g., Weak Consistency.
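
As one example of "some algorithm" for selecting a version, the sketch below picks the copy with the higher update counter and breaks ties by agent identifier, so that every agent reconciling the same two copies discards the same one. This is just one possible rule, chosen for illustration; the class names LeafVersion and Reconciler and the counter field are hypothetical.

```java
// Hypothetical record of one agent's version of a leaf's replacement subtree.
class LeafVersion {
    final String content;  // e.g. the marshalled subtree that replaces the leaf
    final long counter;    // how many updates this copy has seen
    final String agentId;  // unique name of the agent that produced it

    LeafVersion(String content, long counter, String agentId) {
        this.content = content;
        this.counter = counter;
        this.agentId = agentId;
    }
}

class Reconciler {
    // Deterministic choice: higher counter wins; on a tie the larger agent id wins.
    // The losing version is simply discarded, which is the "lossy" part.
    static LeafVersion pick(LeafVersion a, LeafVersion b) {
        if (a.counter != b.counter) {
            return a.counter > b.counter ? a : b;
        }
        return a.agentId.compareTo(b.agentId) >= 0 ? a : b;
    }
}
```

The important property is that `pick(a, b)` and `pick(b, a)` return the same version whenever the two agent identifiers differ, so agents that reconcile independently still converge.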

RICH USE OF THE DATA: One important point, especially for those implementing, or designing, more complex systems, is that the interaction of the user with the knowledge is not limited to the replay of questions and the guessing of animals.

The basic data structure is a binary tree. Internal nodes are questions. Leaf nodes are animals. Presence in the subtree on the left side of a question implies that the content of the question is true for that animal. Presence in the subtree on the right side implies that the content of the question is false for that animal.

Implicit in the tree is the ability to use it for playing the animal game, trying to guess an animal by asking the questions on the path that lead to that animal. But there are other ways to use the knowledge. For example, here are four ways you could use knowledge other than playing the animal game:

Explicit knowledge: You can store your data in such a way that a user might ask, "What do you know about a shark?" and your agent can say "It is bigger than a mailbox," "It has teeth," "It eats meat," etc.
Relative differences: Or, the user might ask "Do you know the difference between a parakeet and a hippopotamus?" for which your agent could look back up its tree, find the divergence point, and say, "Yes, the difference is that a parakeet is smaller than a mailbox, but a hippopotamus is not."
Relative similarities: Or, the user might ask, "Do you know how a shark and a parakeet are similar" to which your system could reply, by following the part of the path from the root node that they share, "Does not have fur," "Does not live on a farm," "Has two eyes," etc.
"Re-try" knowledge: Having reached a leaf node unsuccessfully, the agent could then start at the root of a different tree, before giving up.

Each of these uses requires that the data structure be modified so that, e.g., the agent can work back up the tree.
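
One way to let the agent work back up the tree is to add a parent reference to each node. The sketch below assumes that modification and shows the "explicit knowledge" and "relative differences" uses. The class names KNode and Knowledge and the example questions are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical node with a parent link so we can walk from a leaf back to the root.
class KNode {
    final String text;
    KNode parent, yes, no; // yes = left subtree (true), no = right subtree (false)

    KNode(String text) { this.text = text; }

    static KNode question(String text, KNode yes, KNode no) {
        KNode q = new KNode(text);
        q.yes = yes;
        q.no = no;
        yes.parent = q;
        no.parent = q;
        return q;
    }
}

class Knowledge {
    // Explicit knowledge: every question on the path to this animal, with its answer.
    static List<String> factsAbout(KNode animal) {
        List<String> facts = new ArrayList<>();
        for (KNode child = animal; child.parent != null; child = child.parent) {
            String answer = (child.parent.yes == child) ? "yes" : "no";
            facts.add(child.parent.text + " " + answer);
        }
        Collections.reverse(facts); // report root-first
        return facts;
    }

    // Relative difference: the question at which the paths to two animals diverge.
    static KNode divergence(KNode a, KNode b) {
        List<KNode> ancestorsOfA = new ArrayList<>();
        for (KNode n = a.parent; n != null; n = n.parent) ancestorsOfA.add(n);
        for (KNode n = b.parent; n != null; n = n.parent) {
            if (ancestorsOfA.contains(n)) return n; // lowest shared ancestor
        }
        return null; // the two animals are not in the same tree
    }
}
```

Relative similarities fall out of the same two pieces: the facts on the path from the root down to (but not including) the divergence node are exactly the answers the two animals share.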

EXTENDED GROUP MEMBERSHIP: Design your agents so that they can migrate between groups.

GROUP DATA:

Agents in one group share data only with other agents in the group. In this way there are multiple, shared, but globally inconsistent, data stores.
Implement a security model for groups, including a secure ACL, and the secure sharing of learned data.
Build aggressive snooping into agents such that they attempt to steal the data from other groups.
Agents are allowed to migrate from one group to another. How does the trust model support this? What happens if security is compromised?

LOCATION SERVICE: Use a central registration server, so that all agents can find all other agents, or you might use, e.g., some kind of UDP broadcast announcement that another agent has joined the game. What happens if your registration server goes down?

COMMUNICATION SIMULATION: You might consider having agents communicate with one another through a central authority for the purpose of allowing for interesting developments with security and the eavesdropping on the conversations of other agents. That is, if a central communication authority is used (artificially) then groups of co-operating agents communicating through this service may have their network traffic "snooped" by other agents wishing to steal data. This can make an interesting simulation, such that agents not part of the group can take advantage of messages being sent within the group.

SECURE COMMUNICATION: Alternatively you might wish to use secure communication, such as with web services security, between agents, showing that NO ONE could snoop.

The Lunch Problem:

This is just one example of many problems that may have to be addressed, depending on the design of your system. For those that choose to maintain a consistent copy of the data for all agents in a particular group, the following issue has to be addressed:

The assumption is that you will lock either your entire tree structure, or at least a node of your tree structure during an update. It is not important whether there is a single, central, data store, or a distributed set of data stores that are kept consistent. The problem is the same in either case.

However, if you think this through, the update includes interaction with the user, and the user might, e.g., go out to lunch in the middle of your locked update critical section, thus effectively shutting down the entire system while they are at lunch, explained as follows:

If you lock the system before asking the user what question distinguishes an X from a Y (new animal) you will get a valid question and animal pair. But, you have the lunch problem --- the user, who has a lock on the X node, may go out to lunch before they submit the question. If you do not lock the tree for update until after the user has SENT the question and the new (Y) animal, node X may have moved down in your tree beneath new question Q, or questions Q1, Q2, ..., during the time between the asking for the input, and the sending of it by your user. If you simply install your data at the NEW leaf location of X you will have to make assumptions, quite possibly invalid, about the relationship of Y to the questions Q.

For the purpose of discussion, let's call this the lunch problem.

Here is another pass at the problem: we could decide to lock only the particular node that the user is updating. This helps, but what if two users are updating that same node? We may then have to ask a second question of a user. To wit: User A and user B have both terminated on "goose". No locks are installed until after the users complete the slow process of saying what their animal is (dog, horse), and what the question that discriminates it from a goose is. After queuing the animal/question pair, a lock is sought, and when granted, the update to the "goose" node is made. But both A and B have been asked "is it a goose?" User A's agent gets the lock first, and builds the node (dog "does it bark" goose) in place of just "goose". User B's agent now gets the lock, notices that the goose node has been changed, and has to ask the user the additional question, "Does a horse bark?" and only then can build the final node (dog "does it bark" (horse "does it have shoes" goose)) - and if a horse DID bark we would still have to ask, "Does a dog have shoes?"
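
The "notices that the goose node has been changed" step can be sketched with an optimistic version check (a compare-and-set): the lock is held only for the instant of the update, never while the user is typing, and a stale update is refused so that the agent knows to ask its follow-up question. This is a minimal single-JVM sketch; the class LeafSlot, the version counter, and the string form of the subtree are assumptions made for illustration.

```java
// Hypothetical holder for the position in the tree where "goose" was the leaf.
class LeafSlot {
    private String subtree;
    private int version = 0; // incremented on every successful update

    LeafSlot(String initialAnimal) {
        this.subtree = initialAnimal;
    }

    synchronized int version() { return version; }

    synchronized String subtree() { return subtree; }

    // Short critical section: no user interaction happens while the lock is held.
    // Returns true if the update was applied. Returns false if the slot changed
    // after the caller read seenVersion; the caller must then re-read the slot,
    // ask the user any extra question, and try again.
    synchronized boolean tryReplace(int seenVersion, String newSubtree) {
        if (seenVersion != version) {
            return false;
        }
        subtree = newSubtree;
        version++;
        return true;
    }
}
```

In the goose example, A and B both read version 0. A's update succeeds, B's update is refused, and B's agent then asks "Does a horse bark?" before retrying against version 1. Nobody going to lunch can block anybody else, because no lock spans a user interaction.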

Questions

Suppose you are using a central data store. We might ask:

What happens if the central repository crashes?
What version of the data is guaranteed if a backup agent/server takes over?
Is it possible that ANY agent can elect itself as the server for the new central repository of the state, such that as long as there is a single DIagent available it can claim ownership of the canonical knowledge store, and continue running?
How does your algorithm resolve conflicts? For example, suppose that agents A, B, and C comprise the group. A maintains the most recent update of the knowledge base, which B has replicated. C has been off line because of process failure and is seriously behind in its knowledge. B loses its network connectivity so does not know whether the other agents are active. But, B continues to collect new knowledge. A goes off line due to process failure. C comes online and cannot make contact with A, the knowledge server, nor can it find B. C falls back on its own version of the knowledge, which is out of date. C updates its knowledge, so it is now the most time-current. A comes back on line. Which knowledge store takes precedence? The most current (C), or the most extensive (A)? Now suppose that B's network connectivity has been restored -- which of the three knowledge stores is to be chosen? A, which is considered the canonical store, and is extensive; B, which is extensive, and recently updated, but which has not had the benefit of keeping consistent with the rest of the group; or C, which is the most recent, and has been the most recently in contact with the rest of the group (which could have included contributions from DIagents D, E, F, etc.), but which is not as extensive? See Tanenbaum section 5.4 -- Election Algorithms.

Data structures

The existing animal game just uses a simple, dynamic, binary decision tree structure. One tree could be very different from another tree containing the same animals, and the questions could be very different as well. Interestingly, there will always be one fewer question than the number of animals, no matter what the structure. (Think about this: for every animal added, a question is added as well...)

Given the structure, it is generally not possible to integrate the knowledge from one tree directly into another tree.

But, this does not mean that no integration can take place. For example, should two or more agents decide to share knowledge:

A larger tree can always be preferred over a smaller tree.
An agent could keep more than one tree. If it guesses an animal, and is wrong, then before asking what the animal is and for the question that discriminates the guess from the new animal, it can start on a different tree. If it guesses the animal in the second tree this is a "win" and it can also simply ask for a question that allows the animal to be added to the original tree.
Knowledge can be demonstrated in other ways too. For an animal (a leaf node) we can ask what an agent knows is true of that animal. By working back up to the root node we can know that, e.g., a duck has feathers, is warm blooded, has a beak, is a farm animal, lives on water, and flies. But this knowledge can come from more than one tree. In this way "learning" other trees can be directly seen as increasing knowledge, even if no integration has taken place. That knowledge comes from two or more trees is transparent to the user.
Once we look at knowledge as being something we can produce in the form of true statements about animals, we can also allow a higher level of knowledge in the form of similarity where animals that share more of the same answers to the same questions are seen as more similar. And, importantly, we can say WHY an animal is similar to another. (E.g., because ducks and geese, even though they are different colors, are also farm animals, that live near water, fly, have beaks, are warm blooded, and have feathers, they are similar.)
Additionally, when we absorb another knowledge tree, we automatically know more animals, and if we are clever we can find non-burdensome ways to integrate these animals into our canonical tree. "Hey, by the way, is the answer to 'Geese have feathers' true or false? I was just wondering?" Over time, by asking a few extra questions during each game, all of the animals can be integrated.
And, we may already know the answer to some of the questions automatically. If in ANY of our trees "Geese" appears anywhere on the true side of "has feathers" we will be able to move the data down to a lower level in the tree, past that question node, without asking.
Thus, at any node, we may have multiple animals in various stages of integration into a canonical tree. At any agent we may have multiple trees. We may have much that we can say about any particular animal using more than one tree to answer. We may already know answers to questions that will help categorize an animal as soon as we get that far down in the tree.
And, it is also true that we may have questions that we are looking to answer, that are known by other agents: "Do you happen to have geese in your knowledge base, and if so do they have feathers?" possibly allowing for trading of information, or bargaining for membership in a group.
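
The idea that we "may already know the answer" from ANY of our trees can be sketched by flattening each absorbed tree into a table of (animal, question) answers. This is a minimal sketch under assumptions: the class names TNode and FactBase are hypothetical, questions are matched by exact text, a question is assumed to appear at most once on any path, and if two trees disagree the tree absorbed later silently wins.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical tree node: internal nodes are questions, leaves are animals.
class TNode {
    final String text;
    final TNode yes, no; // yes = left (true), no = right (false)

    TNode(String animal) { this(animal, null, null); }

    TNode(String question, TNode yes, TNode no) {
        this.text = question;
        this.yes = yes;
        this.no = no;
    }

    boolean isLeaf() { return yes == null; }
}

class FactBase {
    // facts.get(animal).get(question) is the known answer, if any.
    private final Map<String, Map<String, Boolean>> facts = new HashMap<>();

    // Record, for every animal in the tree, the answers along its path from the root.
    void absorb(TNode root) {
        walk(root, new HashMap<>());
    }

    private void walk(TNode n, Map<String, Boolean> path) {
        if (n.isLeaf()) {
            facts.computeIfAbsent(n.text, k -> new HashMap<>()).putAll(path);
            return;
        }
        path.put(n.text, true);
        walk(n.yes, path);
        path.put(n.text, false);
        walk(n.no, path);
        path.remove(n.text);
    }

    // Empty means we would still have to ask a user (or another agent).
    Optional<Boolean> knownAnswer(String animal, String question) {
        Map<String, Boolean> answers = facts.get(animal);
        if (answers == null) return Optional.empty();
        return Optional.ofNullable(answers.get(question));
    }
}
```

When an animal is being pushed down the canonical tree, the agent can consult `knownAnswer` at each question node and only interrupt the user when the result is empty. An empty result is also exactly the kind of question that could be put to other agents.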

Bragging Rights:

Assuming you have gotten your DI Agent built, you get to brag anyway, so smile!