Saturday, January 29, 2011

Comments

Rob_mol

Hi Ronald,
Nice blog. Good idea to compare those schools for implementing Data Vaults.
I'm wondering whether there shouldn't be a fourth (also mixed) school with a dimensional staging out.
Another point of discussion is whether we should model every source table in the form of a DV. DV is an especially good way of modelling (important) master data. That doesn't mean, IMO, that all the tables should be modelled as HUBS or LINKS with underlying SATS.
I am also looking forward to your deliberations on the 'let's generate it all' lobby.

Ronald Damhof

Hey Rob,

thx for the heads up - I deliberately skipped the staging layer for dimensional output. Technology is racing ahead at the moment and the data mart architecture/strategy is influenced by it big-time (e.g. virtualisation, in-memory, massive processing power, etc.). It deserves a separate blog post! Good point though.

Your second question is extremely valid! More and more I am convinced that the characteristics of the source data determine the way it should be propagated. For example the distinction between master data and transactions/events. Or the distinction between 'already historically staged data in the source' versus 'not yet historically staged data in the source'... Again - separate blog post. This one I have had my eyes on for some time now. Again - good point.

Thx for the reply.

DM_Unseen

Hi Guys,

@Rob,

The answer is of course *no*
IMO the biggest friends of a DV modeler are the (historized) reference table and the transactional link, neither of which are primary DV constructs. The real art of DV modeling is not what you put in your DV (model) but what you leave out (kind of DV-ZEN ;)

Ronald Damhof

DV-ZEN ; lol

DM_Unseen

@Ronald,

Back to School:

We *Always* separate the design of our Data Vault and Business Data Vault, because they are (more or less) independent. I classify Data Vaults and Business Data Vaults separately.


Data Vault schools:

School 1: "Source system(s) super ODS"
Raw non-integrated Data Vault

School 1a: "Super ODS"
Raw integrated Data Vault.
This is what we are doing at the RU BTW

School 1b: "Classic DV"
"Business Model Oriented Data Vault". There is a sliding scale between schools 1a and 1b.


Business Data Vault Schools:

Business Rule Vault:
(your option 3) Puts business rules in a DV structure for staging out. This is what we are doing also.

(Full) Business Data Vault.
If there is *no* staging out BR to the data marts I call it a BusVault (Kimball BUS architecture in a Data Vault). If there is some staging out (BR) it's a Full Business Data Vault.


I can (and will) combine any and all to get to the right architecture for the organization (there is no "best" solution IMO, just some sub-optimal choices).
I do have to remark that I consider source or business model orientation a business decision and not a DV architectural one. For me this is not an argument for or against a DV architecture, but one for running the organization.

DM_Unseen

@Ronald,

Surprisingly ZEN-DV is grounded in formal science/model transformation theory. The best proof is the definition and usability of the "Hub Minimalization Rule".

Ronald Damhof

Some issues I have:
First; you seem to separate upstream architecture from downstream architecture. Which is fine, but it lacks coherence, I believe. For example; the Business Rule Vault as well as the BusVault - as you call it - are hard to combine with school 1 as mentioned by you. Even combining them with school 1a could bring additional challenges.
Furthermore; I am not in favor of the term 'Business Vault'. Somehow it surfaced in the last few years - it confuses discussions a lot in my opinion.

Second; the term ODS is tripping me up. It is a highly misused term and lots of peeps have a specific association with it. Your 'ODS' does not seem to be consistent with, for example, Inmon's ODS. Why not call it what it is: a copy of the source where history is maintained? A persistent/historical staging environment might be a better term.

Third; school 1a - raw integrated Data Vault. Just curious, but what do you integrate if it's not the business keys?

Fourth; 'source or business orientation is a business decision'. I do not get this at all. Architectural choices like this one (which in my opinion is a design decision) are based on requirements, leading architectural principles and information architecture. The choice for source or business orientation should always be argued in line with these requirements, principles and information architecture. One can argue that ALL design decisions are business decisions, but I have never confronted my business with these detailed choices.

Last but not least; I think it is our job to think about decision trees; general guidelines for determining when to use what school of thought and what its implications are for business as well as IT.
And we just started that discussion – cool ain’t it ;-)

DM_Unseen

Some feedback:


1. I agree they are not fully orthogonal, but that's because you want to put a 'split' somewhere, even if, looking at it conceptually, there is no *real* split (THERE IS NO SPOON!) *unless* you consider a metadata split.

I used Business Data Vault (or Business Vault) for anything Data Vault-like that is not directly connected to source systems, but YMMV; better words create better worlds :)

2. I know ODS is misused a lot, and I actually don't care a fig about those idiotic definitions ;)
A historized staging area (HSA) is usually in 5NF IMO, so that's why I do not use that name.


3. That's exactly what we integrate, but we have almost no business model/analysis to optimize our DV further towards a classical DV.

4. I think I understand your comment, but if the business thinks sources need to drive the DV it's *their* decision, not mine. Of course it will become an architectural principle etc. etc. I'd like to live in your world for once, because *my* customers/end users are confronting *me* with these detailed choices all the time :)

Rob_mol

Some other thoughts:

I think it will help to differentiate between a conceptual model and a physical implementation.

The conceptual model consists, IMO, of three segments: source data, business model (in terms of keys and attributes of the business entities) and information model (relevant facts and dimensions). On this conceptual level we have translation and integration rules for mapping the source data onto the business model, and business rules for enriching business data into information with business value. To construct this conceptual model you start with the business value and work your way back to the source data. On this level you don't need any DV school. Maybe some science will help (for instance about business rules methodology and the development of ontologies and taxonomies). Conceptual modelling for BI looks to me like rather uncultivated territory. I am very interested in experiences.

The physical implementation of the conceptual model depends mainly on non-functional requirements (history, traceability, timeliness, adaptability, cost, etc.) and the possibilities of the available toolset. I think it's good to have an overview of the different ways (schools) to physically implement a data warehouse. Here we need good craftsmanship: the carpenter who knows when to use a hammer and when to use a screwdriver.

DM_Unseen

@Rob,

Interesting comment. At the Radboud University we're thinking along the same lines. We use what we call an 'internal ontology cloud', which is in fact an adapted FCO-IM representation. This will allow us to abstract from almost *anything* (source systems, business models, business rules etc.) and conceptually integrate all our models in one repository. Here we can tie all our models together and generate DV, star schemas, whatever-you-like schemas. The conceptual nature of FCO-IM, used together with elementary fact integration, should allow almost arbitrary models and concepts to relate and integrate. From there we can also generate mappings (note: mappings are not defined as such, but generated from the fact integration).
However, we're still on our first prototype, and a lot of (academic) research and development is still required to get all of this up and running.

Ronald Damhof

@rob; I like the way you think! It is in fact exactly what I am doing at most client sites (although time to really take a deep dive is limited and expensive), and it also conforms to the architectural framework of TOGAF or the EIA of IBM. For example, the above schools are a physical instantiation of one type of conceptual model. There are more.....

Another blog, or...

We should just write a book guys...;-)

Elwin Oost

@Ronald- the term Business Data Vault was coined by Dan, so I guess we're stuck with it (although he also calls it EDW+, wonder where that's from ;-)

(Interested in your current thoughts on generation as well)

@Rob: I see two implementations of BI solutions with built-in conceptual BI modeling, so it's not completely virgin territory:

- Kalido: appears to have a rather good conceptual modeling approach, but I haven't tried it myself yet. With their underlying Generic Data Model instead of DV they have of course technically a quite different approach.
- BIReady also has a conceptual modeler, but less powerful. Its data model is closely related to "raw" data vault.
- (Quipu has no conceptual modeler as yet, but they plan to do so)

DM_Unseen

@Elwin,

Kalido, BIReady and Quipu with their Business Models are well known. The issue is that they are by themselves totally business driven, and neither source driven nor source oriented. You have to do all of that yourself.

If you like tooling, then you want a tool that 'integrates' the business model and source models in a non-destructive and consistent way and then creates/generates a 'Business Oriented' (source auditable) Data Vault where possible, and an EDW+/Business (Rule) Data Vault where required.

Rob_mol

@Elwin @DM_Unseen
Please look out for confusing the conceptual layer and the physical implementation.
Tools like Kalido, BIReady and Quipu have IMO to do with the physical implementation.
Of course there are also (other) tools that can be helpful for developing and maintaining the models and rules in the conceptual layer.

DM_Unseen

@Rob,

I know. However, I think that using FCO-IM (at the conceptual layer) can directly drive the physical layer as well (and should, because you want minimal manual intervention while travelling and transforming between those layers).

Note that FCO-IM can also play a big role in understanding and executing DV transformations as well.
This makes it a *VERY* interesting (conceptual) modeling technique indeed.

Elwin Oost

@DM_Unseen @Rob
I agree these solutions are no panacea, but to call them either strictly source or business-based is imho (mostly) not doing them justice.

Only Quipu is currently still distinctly source-oriented. They're planning to extend it with a business model layer, but they're not there yet.

Kalido and BIReady both have conceptual modelers completely separated from the source systems (though you can reverse engineer from source if required). I particularly like Kalido's conceptual modeler.

There are indeed many other good (/better) solutions for conceptual modeling, but imho these have yet to be integrated better in existing BI platforms (or sparkling new ones) to have more impact on the BI market.

Ronald Damhof

Reading the latest comments, there seems to be a fluid transition towards two very interesting discussions:
- Bridging and differentiating between concept, logic and technique
- The level of model-driven data solutions (the 'let's generate it all' lobby).

My 2 cents; both discussions show that - at least in the Netherlands - we are trying (and sometimes succeeding) in elevating the discussions to a higher abstraction- and professional level.

DM_Unseen

@Ronald,

You might be tempted to think that these 2 issues are somehow related ;)

Dan Linstedt

Hi Guys,

Please don't take this the wrong way...

I see Busn Data Vault, Staging Out, and EDW+ as the same thing... regardless of what you call it... Regarding coining of the term, it really doesn't matter to me..

You all know that what I strive to do is seek consistency, so I think: we should start a poll somewhere (perhaps on LinkedIn?) and take a vote on what to call the layer - let everyone decide...

I guess the other question here is, is there truly only one layer? or are there multiple layers?

One other thought from my side of the house, BDV/Staging Out/EDW+ or whatever you want to call it, is a logical name, for (mostly) a subset of tables coming from the Raw DV (the true EDW), where the data is processed through common business rules used by all data marts.

It doesn't have to be a complete replication of all the data, and it doesn't even have to be a separate model...

Remember the certification class? I talked (albeit briefly) about the scale-free architecture, where you can stack one DV model on top of another in a tree-like fashion, using Links to hook them together.

In other words, you can create separate "master data" hubs, and "business driven sats", and "business driven links" on a new "layer" of Data Vault, that is linked (if you want it to be) by higher grain of links back to the raw Data Vault at points it makes sense.

This concept can get messy to manage, so generally I do recommend a "separate" model, and storage area for this data set, hence: business Data Vault...
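
A minimal sketch of that stacking idea, purely for illustration: every table name, key name and the "raw"/"business" layer label below is a hypothetical assumption, not part of any DV standard or tool. It models hubs, sats and links as plain Python objects and picks out the links that hook a business layer back to the raw Data Vault.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hub:
    name: str          # hypothetical table name
    layer: str         # assumed labels: "raw" or "business"
    business_key: str  # hypothetical key column


@dataclass(frozen=True)
class Satellite:
    name: str
    parent: Hub        # the hub this sat describes
    attributes: tuple  # descriptive columns


@dataclass(frozen=True)
class Link:
    name: str
    hubs: tuple        # the hubs this link connects

    def crosses_layers(self) -> bool:
        # A link "hooks" two layers together when its hubs
        # do not all live in the same layer.
        return len({hub.layer for hub in self.hubs}) > 1


# Raw Data Vault: one hub per source-system business key.
raw_crm = Hub("hub_customer_crm", "raw", "crm_customer_no")
raw_billing = Hub("hub_customer_billing", "raw", "billing_account_no")

# Business layer: a "master data" hub plus a business-driven sat.
master = Hub("hub_master_customer", "business", "master_customer_id")
master_sat = Satellite(
    "sat_master_customer_rules", master, ("segment", "credit_class")
)

links = [
    Link("lnk_crm_billing", (raw_crm, raw_billing)),    # stays inside raw
    Link("lnk_master_crm", (master, raw_crm)),          # business -> raw
    Link("lnk_master_billing", (master, raw_billing)),  # business -> raw
]

# The links that tie the business layer back to the raw Data Vault.
hook_links = [link.name for link in links if link.crosses_layers()]
print(hook_links)
```

Keeping the layer as an attribute on each hub, rather than as a separate schema, mirrors the point that this does not have to be a separate model; moving the business hubs into their own storage area would only change where the tables live, not how the links connect them.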

But I really don't care what you call it, so long as everyone agrees to call it the same thing, and define it with a standard definition.

This would be a great one for the consortium to deliberate on.

Cheers,
Dan L

Ronald Damhof

Dan,

Good post - thx for the feedback. And I don't take it the wrong way. This is why we have blogs! I made a new blogpost regarding the same subject. I have a strong need to discuss terminology and meaning surrounding DV a bit more. So I am going to polarise a bit - I am evil....

And yes - the consortium (or platform) will have its purpose in these discussions. However, they are not a politburo. I think "we", being the DV community, need to come to some kind of agreement on terminology and its definitions.

It is a bit of a mess right now.....

regards - hope the snow was any good!
