The success of a project and its data model is determined by multiple factors; one of the most critical is the collection of requirements and data flow modeling.
Data flow modeling starts with collecting requirements for the New system and documenting the Current system’s data flows.
The Current physical model is documented first and then converted into the logical model. The new system’s logical model is then developed from the requirements and the existing logical data flow.
We can break a project into four stages and review how data modeling fits into each.
In the early stages, functional and non-functional requirements are identified, and the business process is modeled with data flow diagrams. At this time, the data structures involved are also identified.
- During the first stage the Current physical model is identified. Physical here means how the system appears in the real world, whereas Logical describes what the system actually does.
- Next we convert the Current physical model to the Current logical model. The logical models of the Current system are produced based on the DFDs (data flow diagrams).
- Then in the next stage, DFDs and Logical Data Structures of the Required system are produced. During this stage the difference between the Current set of DFDs and LDSs, and the Required set becomes clear.
- At the next stage we produce the Required system’s physical diagram.
Data Flow modeling
We start with Data Flow modeling in order to illustrate the system’s operation. Documenting the data flow as diagrams gives us an important advantage: diagrams are easier to work with than textual descriptions of the functions, especially for large systems.
Typically we develop three levels of DFDs.
Context diagrams (aka Level 0)
This level of DFDs illustrates a top-level view of the data flow.
Level 1 diagram
Level 1 diagrams illustrate the main functions of the system. We would usually limit the number of functions to 2-3 for a simple system and 6-8 for a complex system. The main goal is to keep the model manageable and easy to present on the screen for an audience.
We need to pay attention here to how well we understand the functions.
- Each function needs to have input and output
- Each function needs to have all the data it needs to produce an output
- Gaps need to be documented (what data is missing and where to source it from)
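The three checks above can be sketched in Python. This is a minimal illustration, not a standard tool: the `Function` class, the `review` helper, and the order-handling example are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    """One Level 1 function (a process on the diagram). Hypothetical model."""
    name: str
    inputs: set[str] = field(default_factory=set)    # data flows coming in
    outputs: set[str] = field(default_factory=set)   # data flows going out
    requires: set[str] = field(default_factory=set)  # data needed to produce the outputs


def review(functions: list[Function]) -> list[str]:
    """Return one finding for every rule a function violates."""
    findings = []
    for f in functions:
        if not f.inputs:
            findings.append(f"{f.name}: no input")
        if not f.outputs:
            findings.append(f"{f.name}: no output")
        # Gap: data the function needs but that no incoming flow supplies.
        for item in sorted(f.requires - f.inputs):
            findings.append(f"{f.name}: gap, missing '{item}'")
    return findings


# Made-up order-handling system, for illustration only.
functions = [
    Function("Take order", {"order form"}, {"order"}, {"order form"}),
    Function("Bill customer", {"order"}, {"invoice"}, {"order", "price list"}),
    Function("Archive", {"invoice"}, set(), {"invoice"}),
]

for finding in review(functions):
    print(finding)
```

Each finding is a prompt for a follow-up question to the business, such as where the missing data should be sourced from.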
Level 2 DFD
This level introduces the next level of granularity: at this level we describe each function’s processes. Each function ends up with its own diagram.
Each process is an activity performed by the function, and an L2 DFD description may look like pseudo-code.
Again, we have to make sure that each process has all the data it needs to produce an output:
- Each process needs to have input and output
- Each process needs to have all the data it needs to produce an output
- Gaps need to be documented (what data is missing and where to source it from)
While documenting DFDs it will become clear where we have duplicate data stores, redundant processes, and ambiguous terms used by different business groups. We should try to agree on a common vocabulary if we are to achieve a shared understanding of what we are building.
We pay close attention to “temporary” data stores or caches, where an actor may store data for a short period of time before it flows downstream. These are typically hidden: they are second nature to the actor, who forgets to mention them.
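One rough way to surface candidate duplicate data stores is to compare what each store holds rather than what it is called, since different business groups may name the same store differently. The sketch below assumes the stores have already been listed with their data items; the store names, the data items, and the `duplicate_candidates` helper are hypothetical.

```python
from collections import defaultdict

# Made-up data stores gathered from several Level 2 diagrams:
# store name -> data items it holds.
stores = {
    "Customer file": {"customer id", "name", "address"},
    "Client register": {"customer id", "name", "address"},
    "Price list": {"product id", "price"},
    "Sales desk notebook": {"customer id", "product id"},
}


def duplicate_candidates(stores: dict[str, set[str]]) -> list[list[str]]:
    """Group store names whose contents are identical."""
    by_content = defaultdict(list)
    for name, items in stores.items():
        by_content[frozenset(items)].append(name)
    return [sorted(names) for names in by_content.values() if len(names) > 1]


print(duplicate_candidates(stores))
```

An exact match on contents is a deliberately strict rule. Stores that only partly overlap still need a human review, and the result is a list of candidates to discuss, not a decision.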
Logical DFD
At this point we have enough information to convert L2 DFDs to a logical data flow model, based on the data stores documented for each function.
Entity Relationship Diagrams (ERD)
Now we have a clear view of the system’s functions and data stores. We can proceed to create ERDs and review the Data Model of the system.
This is an important and complicated step – it defines how the information is stored in the system.
Here we document business entities, their attributes, relationships (one to one, one to many, many to many), and constraints, and create a data vocabulary.
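To make the relationship types concrete, here is a small sketch of how entities, relationships, and constraints from an ERD might eventually be expressed as tables, using Python's built-in `sqlite3` module. The entities (`customer`, `customer_order`, `product`, `order_line`) are invented for illustration and are not taken from any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- One to many: one customer places many orders.
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
-- Many to many: an order holds many products, a product appears on many orders.
CREATE TABLE order_line (
    order_id   INTEGER NOT NULL REFERENCES customer_order(order_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (order_id, product_id)
);
""")


def accepted(sql: str) -> bool:
    """Return True if the statement satisfies the constraints, False otherwise."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return False
    return True


print(accepted("INSERT INTO customer VALUES (1, 'Ada')"))
print(accepted("INSERT INTO customer_order VALUES (10, 1)"))
print(accepted("INSERT INTO customer_order VALUES (11, 999)"))  # unknown customer
```

The many-to-many relationship is resolved through a linking table, and the constraints recorded on the ERD (mandatory attributes, uniqueness, valid quantities) become `NOT NULL`, `UNIQUE`, and `CHECK` clauses.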
Non-functional requirements
Functional requirements describe features of the system, i.e. what the system should do or provide for the end-users: they may include a description of each function, report, GUI element, query, etc.
Non-functional requirements, on the other hand, describe control mechanisms, restrictions of the system, security, acceptable performance levels, quality of service, capacity planning details, backup mechanisms and frequency, and recovery methods and times.
And finally we are in a position to produce a physical model of the Required system.