
Saturday, January 21, 2017

Comments


Ja Karman

I see you are stepping into one of those other worlds of Information & Communication Technology.

So maybe the time is ripe for a new data model. Proposal: call it OO-LC. It is CDC-based (change data capture), but the focus is the object (document container, name it whatever) with its life cycle. Nothing is changed in the object itself; it just needs a unique object identifier, the date/time of change, and the date/time from which it is valid. To double-check incoming data, the change-data message can carry the complete before and after images.
The unique object identifier and those two time indicators, preferably together with the source origin (traceability), are propagated to all logical records/tables extracted from the object.
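A minimal sketch of what such a change message could look like, assuming Python. All names here (ChangeMessage, stamp_rows, the field names, the sample deed) are hypothetical and only illustrate the idea; they are not an existing standard or library.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)
class ChangeMessage:
    """One CDC-style change to an object (hypothetical layout)."""
    object_id: str                    # unique object identifier
    changed_at: datetime              # date/time the change was recorded
    valid_from: datetime              # date/time from which the content is valid
    source: str                       # source origin, for traceability
    before: Optional[dict[str, Any]]  # complete image before the change (None for a new object)
    after: dict[str, Any]             # complete image after the change


def stamp_rows(msg: ChangeMessage, rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Copy the identifying keys onto every record extracted from the object."""
    keys = {
        "object_id": msg.object_id,
        "changed_at": msg.changed_at,
        "valid_from": msg.valid_from,
        "source": msg.source,
    }
    # The keys are applied last so an extracted row cannot overwrite them.
    return [{**row, **keys} for row in rows]


# Made-up example: a new deed with two owners, extracted into an "owners" table.
msg = ChangeMessage(
    object_id="deed-001",
    changed_at=datetime(2017, 1, 21, 10, 0),
    valid_from=datetime(2017, 1, 1),
    source="notary-feed",
    before=None,
    after={"owners": [{"name": "A"}, {"name": "B"}]},
)
owner_rows = stamp_rows(msg, msg.after["owners"])
```

Every row in owner_rows now carries the object identifier, both time indicators and the source, so tables extracted separately can be joined back to the object later.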

The question here is what data the business or analytics needs for a specific goal.
- The latest situation, or some other moment in time reflecting the status at that time (time travel is possible).
- The events in all changes to selected objects, perhaps in a dedicated clustered approach for association analyses and/or time-series event analyses.
There are several ways to look at that temporal dimension: building ABTs, classic BI, or even operational information.
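The two views above could be sketched as follows, assuming the versions of one object are kept as a list sorted by their valid-from time. The function names as_of and events_between and the sample data are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

# (valid_from, content) versions of one object, sorted by valid_from (made-up data).
versions = [
    (datetime(2016, 1, 1), {"owner": "A"}),
    (datetime(2016, 7, 1), {"owner": "B"}),
    (datetime(2017, 1, 1), {"owner": "C"}),
]


def as_of(versions, moment):
    """Return the content valid at `moment`, or None if nothing was valid yet."""
    starts = [valid_from for valid_from, _ in versions]
    i = bisect_right(starts, moment)
    return versions[i - 1][1] if i else None


def events_between(versions, start, end):
    """Return all versions whose valid_from falls in [start, end)."""
    return [v for v in versions if start <= v[0] < end]
```

Calling as_of(versions, datetime.max) gives the latest situation; any earlier moment gives the time-travel view, and events_between returns the change events themselves for time-series or association analysis.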

For old-fashioned BI, OLAP included, the needed tables can be built from this. There will be a lot of duplicated and denormalized data, as usual.

This OO-LC breaks with the rigid culture of data modelling in either quadrant (facts in quadrant I and context in quadrant II). It can serve research in a far better way, as I described with the intermediate steps for an ABT.

When there is a valid LC (life cycle), the time series (ARIMA) with events for the content is reliable. Finding gaps signals issues, but the document content can still be valid. That removes the contradiction between consistency and availability.
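One possible way to find such gaps, sketched under the assumption that every change message carries complete before and after images: a gap shows up where the "before" of one change does not match the "after" of the previous one. The function name find_gaps and the sample data are hypothetical.

```python
def find_gaps(changes):
    """Return indexes where a change's 'before' differs from the previous 'after'."""
    gaps = []
    for i in range(1, len(changes)):
        if changes[i]["before"] != changes[i - 1]["after"]:
            gaps.append(i)
    return gaps


# Changes to one object in order of arrival (made-up data).
changes = [
    {"before": None, "after": {"owner": "A"}},
    {"before": {"owner": "A"}, "after": {"owner": "B"}},
    # The change from B to C was never received: the next 'before' shows C.
    {"before": {"owner": "C"}, "after": {"owner": "D"}},
]

gap_positions = find_gaps(changes)
latest_content = changes[-1]["after"]
```

The gap is flagged at position 2, so the event series is known to be incomplete there, while latest_content is still a complete and usable image of the object.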

Do not try to close something that tightens to the right by screwing it the other way (or vice versa). Data analytics turns left where ICT is used to turning right. A change of direction is only possible at approved, agreed moments.

Ronald

Jaap, Thx for the comment!

I cannot really grasp what you're saying, and that makes it interesting. Tbh, I halted already at the very first sentence, at the term 'document container'. Please define this term as precisely as you possibly can on a logical basis (no technical implementations pls).

To me it sounds very ambiguous....:-)

Ja Karman

I see a document container object as the unit containing all related information inside as "the basic unit" of information.

To be honest, it is a real-life experience, so an example could help.
The description of ownership on some property can be very complicated.

Through the actions of the notary it can all be made official. That notary document has a complicated ER relationship with many 0-n's. Expect every possibility to occur in real data. One thing is certain: what has been made official once cannot be reverted in any other way than by a new event. Every event of document change is information in its own right.

Just see that notary document as an example of the "document container". By now I am seeing some 70 technically different tables (types of information) being extracted, somehow to be joined later into about 30 for ease of use. The first concern is not how to manage the information inside those documents but how to manage the documents as a whole.

Working on this topic, I am seeing it as a far more generic approach.
It is a very common way of processing such things, in ways humans are familiar with.

Ja Karman

I see a resemblance to the Damhof quadrants in the figure "big data reference architecture". It is a good blog post by Xomnia.
http://www.xomnia.com/expertise/big-data-engineering/big-data-architecture/

Now take the middle part, which is also the conflict area in your quadrants.
Break it apart after staging, in the middle of DV / data lake.
Take the advantages of the DV, but remove any business structure and add the control of checking technical input completeness.
That way it serves both the old BI (IT-push, restricted design) and the freer ABT route into analytics.

Nice: even the shortcut arrow for getting data into analytics is there.

Ronald

Jaap,

Your example is quite confusing. You are talking about a document as a concept, which is perfectly fine, but what identifies the context of such a document? Time? Parties involved? Object (real estate, ...)? And how is this context identified (since the desire to integrate is high in analytics)? I think you are mixing up (at least) logical and technical concerns in your example (70 'tables' sounds pretty technical), which makes it confusing what you actually wanna accomplish.

If you execute a pure document approach - which is not a bad thing depending on the use case - the context is hidden in the document and often not explicitly stated. This could be ok when documents are somewhat standardized, but they are - by definition (unless it is XBRL type of docs) - not. So I got tons of documents with huge variety and I wanna do some analytics.....good luck on ya.

R.

Ja Karman

Ronald, take this one as an example. It is the ER relationship of all kinds of information with all kinds of legal-rights events that you make official at the notary (paperwork, document). https://www1.kadaster.nl/1/imkad/documentatie/20120508/index.htm
XSD, fully and highly structured, although still complicated.
By nature, this specific one carries a full description with every update, not just partial updates referring to previous ones.
Where is the issue you are seeing?

Combine this https://www.linkedin.com/pulse/big-data-reference-architecture-martijn-imrich?forceNoSplash=true with your quadrant model: the middle area is the conflict zone. The others sit nicely in their places according to your quadrants. Even the shortcut from upper left down to lower right is there.

Ja Karman

Marvelous: We ‘the architects’ should stand firm in stating the message ‘this is complicated, we can do it, but I am not sure how’.
