You have aptly described how the basic framework data classes, DataAdapter/DataSet, work, and you are absolutely correct that you can easily track which rows have been changed and take advantage of some of the automatic wiring within these objects. However, you have missed the boat on several points.
First, the point of the thread was an inquiry about using cache to reduce the number of round trips to the database. In particular, a Refresh function to update a large data set after changes are made to some of the items.
Second, the thread is about Mobile Business Objects (MBOs), an approach that runs contrary to using DataSets, DataAdapters, etc., since those objects work on collections and data "sets". MBOs encapsulate all of the data access logic so that the individual objects manage themselves.
Third, the question is not how we get the data to and from the database or recognize what has changed, but what we do with the changes and how we refresh our local/client objects with the returned data.
While we have focused on how caching could be implemented to eliminate round trips and network hops when retrieving data, we have lost sight of the real issue here. I will refer you to a number of other threads that hash out the subject of refresh given the data portal mechanism as provided by Rocky in CSLA. The issue that led to those threads, and that is the basis for the original post, is the need to update the objects on the client so that they accurately reflect changes made to the database.
It seems to me that reviewing Patrick's original points may help boil this down:
- Lets say I would like to load a very big customers collection with their photos from the Data Portal via a Web Service.
- Now I change some of the customers on the client and save the changes. The client only sends back the changed customers to the Data Portal.
- Now I would like to refresh the customer collection on the client but I would like the Data Portal to only send changed customers (updated, deleted, inserted) back and not all the customers again (to save bandwidth).
It sounds to me like the question really has to do with refreshing the customer collection on the client. The reason is that, under the CSLA framework, only those ITEMS that have been changed in some way are sent back to the database via the data portal (Patrick's second point). This presumably occurs when the customers collection's ApplyChanges (or whatever name) method is called. The behavior provided by CSLA is very much like what a DataSet does, in that it iterates through its items (rows) and executes whatever operation is necessary on each object (if it's new: Insert; removed: Delete; modified: Update). This passes control of the actual operation to the individual objects and leads us to where the problem arises.
When we execute a data portal method, we end up with a second copy of our object in the data portal's return value. As a result, the customers collection still holds references to our original, pre-operation objects and somehow needs to be updated with the copies that were returned from the data portal.
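To make the stale-reference problem concrete, here is a minimal sketch in Java (the same shape carries over to C# or VB.NET). Everything in it is a hypothetical stand-in rather than CSLA's actual API: `State`, `Customer`, `FakePortal` and `CustomerList` are names I made up, and the portal simply copies the object to imitate what serialization across the wire does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical item states; a real framework tracks these with its own flags.
enum State { NEW, MODIFIED, DELETED, UNCHANGED }

class Customer {
    final int id;
    String name;
    State state;
    long version; // stand-in for any value the database assigns on save

    Customer(int id, String name, State state, long version) {
        this.id = id;
        this.name = name;
        this.state = state;
        this.version = version;
    }
}

// Stand-in for the data portal: it works on a copy, as serialization would.
class FakePortal {
    private long nextVersion = 1;
    final List<String> log = new ArrayList<>();

    Customer execute(Customer sent) {
        Customer copy = new Customer(sent.id, sent.name, sent.state, sent.version);
        if (sent.state == State.NEW) {
            log.add("INSERT " + sent.id);
        } else if (sent.state == State.MODIFIED) {
            log.add("UPDATE " + sent.id);
        } else if (sent.state == State.DELETED) {
            log.add("DELETE " + sent.id);
        }
        copy.version = nextVersion++;   // the "database" changes the copy only
        copy.state = State.UNCHANGED;
        return copy;
    }
}

class CustomerList {
    final List<Customer> items = new ArrayList<>();

    // Sends only changed items and returns the copies the portal handed back.
    // Note that 'items' still points at the originals afterwards.
    List<Customer> applyChanges(FakePortal portal) {
        List<Customer> returned = new ArrayList<>();
        for (Customer c : items) {
            if (c.state != State.UNCHANGED) {
                returned.add(portal.execute(c));
            }
        }
        return returned;
    }
}
```

After `applyChanges` runs, the objects in `items` still carry their old `version` and `state`; only the returned copies know what the database did. That gap is the whole refresh problem.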
As I mentioned, there are a number of other discussions on this topic that you may refer to for more discussion on this point. But, if you narrow the scope of this question down to what it is really about, we can focus on this same issue as the cause of our concerns.
There are many proposed solutions, including Rocky's implementation, in which the collection is updated to refer to the new object rather than the old one; this is handled by the object itself once the new object is received from the data portal. Our solution, however, was to implement a protected virtual MergeWith method in our business objects that accepts an object of the same type as the owner. Using this method, we can "copy" whatever properties we need from the returned object into our original object and preserve all references to that object. We call this method from our data portal methods and pass the returned object as an argument. This approach has worked well for us.
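Here is a rough sketch of the MergeWith idea, again in Java rather than our actual code. The names `BusinessBase`, `Order`, `OrderServer` and the `lastChanged` field are illustrative assumptions, and `OrderServer.insert` only imitates a data portal call by returning a modified copy.

```java
// Base class: merges server-assigned values into the existing instance.
abstract class BusinessBase<T extends BusinessBase<T>> {
    long lastChanged; // hypothetical concurrency stamp set by the database

    // Overridable hook (the Java analogue of "protected virtual").
    // Subclasses copy whatever extra fields the server may have changed.
    protected void mergeWith(T returned) {
        this.lastChanged = returned.lastChanged;
    }
}

class Order extends BusinessBase<Order> {
    int id;      // 0 until the database assigns an identity value
    String note;

    Order(int id, String note) {
        this.id = id;
        this.note = note;
    }

    @Override
    protected void mergeWith(Order returned) {
        super.mergeWith(returned);
        this.id = returned.id; // pick up the new identity value
    }

    // Data-portal-style save: the server works on a copy, then we merge
    // the copy back so every existing reference to 'this' stays valid.
    void save() {
        Order returned = OrderServer.insert(this);
        mergeWith(returned);
    }
}

// Stand-in for the server side of the data portal (not a real API).
class OrderServer {
    static Order insert(Order sent) {
        Order copy = new Order(42, sent.note); // pretend identity value
        copy.lastChanged = 1000L;              // pretend timestamp
        return copy;
    }
}
```

Because `save` updates the original instance in place, a collection (or a UI binding) that already holds the object never has to swap references.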
A couple of additional thoughts. This only really matters if there is some new or different information contained in the returned object. This might be the value of an identity column, timestamp or something set as a result of logic in the stored procedure being used for the operation. If this is not the case, then the original collection is up-to-date and doesn't need to be refreshed.
The subject of caching, which has obscured the real issue here, is still valid and can still be used to reduce the amount of network traffic the application requires. We implement caching on our fetch methods to optionally retrieve the requested item(s) from the cache rather than requiring a round trip to the database. We automatically update the cached copy of the object(s) whenever we apply an Insert, Update or Delete operation. As a result, our cached copy stays in sync with the database, which is the goal.
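The caching arrangement described above could look roughly like this. It is only a sketch under my own assumptions: `Product`, `ProductStore` and `ProductRepository` are invented names, the "database" is an in-memory map that counts round trips, and there is no expiry or thread safety.

```java
import java.util.HashMap;
import java.util.Map;

class Product {
    final int id;
    final String name;

    Product(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Stand-in database that counts how often it is hit.
class ProductStore {
    final Map<Integer, Product> rows = new HashMap<>();
    int roundTrips = 0;

    Product select(int id) { roundTrips++; return rows.get(id); }
    void upsert(Product p) { roundTrips++; rows.put(p.id, p); }
    void delete(int id)    { roundTrips++; rows.remove(id); }
}

class ProductRepository {
    private final ProductStore store;
    private final Map<Integer, Product> cache = new HashMap<>();

    ProductRepository(ProductStore store) {
        this.store = store;
    }

    // Fetch optionally serves from the cache instead of the database.
    Product fetch(int id, boolean useCache) {
        if (useCache && cache.containsKey(id)) {
            return cache.get(id);
        }
        Product p = store.select(id);
        if (p != null) {
            cache.put(id, p);
        }
        return p;
    }

    // Every write also updates the cache, so the two stay in step.
    void save(Product p) {
        store.upsert(p);
        cache.put(p.id, p);
    }

    void delete(int id) {
        store.delete(id);
        cache.remove(id);
    }
}
```

The point of updating the cache inside `save` and `delete` is that a later cached fetch never returns something the database no longer agrees with, at least for changes made through this client.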
Finally, it is my personal opinion that having a SELECT * statement at the end of your Insert/Update procedure adds undue network traffic rather than reducing it. If you have a large table or a large view with many columns, this will return a lot of data that you already have. My suggestion is to limit this statement to return only the columns that have been affected by the procedure. In our case, we have a datetime field for concurrency checks that is automatically updated by the sproc every time a record is updated. This is the only value that is returned (by default) from our Insert & Update procedures. Our base class implementation of MergeWith(...) copies the returned value to the local object so that it remains in sync with the database.
You can certainly do it the way that tymberwyld has described but, as I said above, you will have to deal with DataAdapters, etc. for this approach.
I certainly hope that clears things up for everyone and helps in some way.