t e m p o r a l 
 d o o r w a y 

Building Better Data Implementations With C++ Builder

 

Introduction

C++ Builder offers many tools for organizing database applications. Some of these are unique, and using them effectively requires an understanding that, unfortunately, usually takes several implementations to develop. Trial and error to find what works and what doesn't can leave your system littered with examples of poor component use, naming, and programming logic that you won't have time to clean up later. This article will help you jumpstart development with good data implementation practices and skip the mess.
To fully understand this article you should know C++ Builder and database basics, know about data modules in general terms, and be familiar with TTable, TQuery, TDatabase and TField objects.

Goals

Goals of any good data implementation include:

  • Keeping the implementation as similar as possible to the database design - to make it easier to verify the implementation and to simplify development and maintenance.

  • Grouping related data components to help developers reuse the implementation.

  • Organizing the implementation for future extension / enhancement to allow adaptation to changing requirements.

  • Making the implementation DBMS (Database Management System) independent, to allow the application to scale between more and less powerful database systems, depending on the needs of users, clients, or customers.

  • Keeping the implementation independent of the user interface so it can be used by other forms and applications.

  • Making names clear and direct so program logic will be readable and comprehensible - aiding maintenance and enhancement.

  • Creating an efficient data implementation to ensure good performance at all scales.

What's Available To Help

C++ Builder offers many tools and components to attain these goals, including:

  • Data modules - (TDataModule) the form class that is the typical home of data components.

  • Data components (TTable, TQuery, TDataSource, TDataSet) - providing DBMS independent views (cursors) into database tables.

  • Data aware user interface components - linked to a TDataSource, they are outside the data module, and the data module is independent of them.

  • Event handlers - BeforePost, AfterPost, BeforeDelete, etc. are the home of programming logic to filter rows, set default values, enforce constraints, and propagate updates.

  • Persistent fields - (TField) data fields, lookup fields, calculated fields display / allow editing of physical and calculated data, and can also have event handling logic trigged by changes to their data values.

  • BDE aliases - DBMS-independent references to database tables.

Structuring The Data Implementation

Whether you use ER (Chen Entity-Relationship) diagrams or Booch notation, the end result is a set of fundamental data objects or entities. Each has attributes (fields, members, properties), and the values for those fields are drawn from domains [Footnote 1]. In an implementation under C++ Builder, a data module stands for the entity and receives its name from the entity (for instance, "Account"). Each data module should have a primary table ("Table") [Footnote 2], a and a lookup table ("LookupTable") (for lookup fields and other lookup needs from within or outside the data module). It also needs a table for each domain (such as "Type" or "Status"), and perhaps a lookup table for each domain table (TypeLookup", "StatusLookup")

Data Module, Fields Editor, Lookup Field In Object Inspector

The primary and domain tables are the ones which get a data source and which are typically linked to the user interface for editing. Thus, they typically also have event handlers (AfterPost, BeforeDelete) to support referential integrity. Lookup tables may be used in the user interface, but for display or selection only.

Note how neatly this represents a translation of the database design:

Database Design

C++ Builder Implementation

The data module acts as a boundary for the implementation. It takes a special act to bring things across that boundary, which gives you the opportunity to ask - should this be brought in from another data module, or should it be in this data module?

One definite no-no is to access the user interface (i.e. a control) from the data module. If you are even tempted to do this, create your own data aware control instead. Referencing a user interface element from a data module essentially makes it impossible to reuse the data module with any other user interface. That's why data aware controls are so useful.

Notice also how a well-designed structure with good clean names aids in having a clean notation - one can now refer to Account->Type->FieldbyName("DESCRIPTION")->AsString; or Account->Table->First(); This helps keep you thinking of data components in an object-oriented fashion.

DBMS-Independence And Why You Need It

In the bad-old-days, the application knew the physical layout of the data on disk. A little later, the first level of independence was born when all developers needed to know was the layout of the file. Next came the database schema and the DBMS (Database Management System), which let the developer avoid knowing the file layout of the data, and which provided SQL and / or an API to access the database content.

Under C++ Builder, there are two more layers between the developer and the data. The first, the BDE, provides an alias (a name which identifies a database by DBMS and location) to the C++ Builder program. The BDE is the only part of the system which typically knows the actual location and DBMS of a database. The alias is then used in and hidden by the TTable and TQuery components (as DatabaseName), so that the C++ Builder program does not even need to know the alias - only the name of the TTable or TQuery component.

Relationship Between Table / Query, BDE, Driver and DBMS database

The BDE also offers an amazing capability - the heterogeneous join - which allows a SELECT statement to join tables across DBMS - for instance, a dBase table can be joined to a Sybase table, simply by properly specifying the alias - which allows you to work with legacy applications or foreign databases as if they were a natural part of your system.

Why is all this abstraction good for the developer? Simple - if done right, a C++ Builder program can scale from a simple desktop DBMS like dBase or Paradox all the way to large client / server databases like Sybase and Oracle (which can support thousands of users) without any programming changes.

Attaining this independence requires some simple discipline on your part.

  • Whatever DBMS you use, keep table and field names at the lowest common denominator - 8.3 uppercase table / database-file names, and 15 character upper case field names without spaces, numbers, or special characters are probably safe.

  • Establish separate BDE aliases for any tables which you may wish to have reside in separate locations - for instance, on the client vs. on the server. This gives you flexibility in distributing your database.

  • Avoid references to database or table names in your code. Keep those names restricted to data components in the data module. If you must reference them in your code, use notation like Table->DatabaseName rather than "MyAlias". For maximum independence, place a TDatabase object in each data module for each alias, and instead of having tables refer to the alias, have them refer to the TDatabase object. A change to each of these TDatabase objects automatically alters the database used by the tables and queries.

  • Use persistent fields. Persistent fields are, of course, useful for lookup and calculated fields, but you can also use them in your code to avoid references like Account->Status->FieldByName("DESCRIPTION"), which might be vulnerable to a DBMS change. You can instead refer to Account->Status->StatusDescription->AsString; (Note that even if you rename the field object from its default, it must have a name unique in the data module. Otherwise we could refer to Account->Status->Description->AsString. Prepending the table name is probably the least objectionable way to deal with this problem).

There are other important factors in the use of persistent fields. First, it is independent of the sequence of fields in the table, so reordering those fields has no effect on your use of the fields (though renaming and deleting does). Second, you can control the display of a table field in a grid or elsewhere by setting the Visible property of the field (which controls if it is shown in a grid), its EditMask, its Alignment, its DisplayWidth, and its DisplayLabel. Third, you can attach events to the field to deal with new or changed content (in such handlers as OnSetText and OnValidate). Fourth, references through persistent fields are very efficient compared to FieldByName, since a lookup of the field name and determination of its status and position in the row does not need to be done on each reference to the field.

What's Left?

With this strategy, you have a very clear data implementation which is separated from the user interface and from other data modules, highly reusable, tightly related to the database design, efficient, and which provides clean notation for your program code. What more could you want? Well, don't forget, to retarget this data implementation to a new DBMS, all you should need to do is...

  • Create your tables in the new DBMS, using the same table and field names.

  • Retarget the alias to the new DBMS driver and database location.

  • Run your program.

If, for some reason, you are forced to change table names, you only have to change them in the referencing objects in the data module (tables and queries). Field names only have to be changed in the persistent field objects and in SQL.

Finally, you can use form inheritance to create variations based on a core data module. Add some fields for special applications in a descendant of the data module drawn from the object repository. Or add tables to it. Or override or supplement Table, Query or Field event handlers. The possibilities are endless, and largely unexplored. Try something new.

Special Note For dBase, SQL Server and Other Databases With Table Prefixes and Suffixes

If you use TTable, you are open to a particularly annoying problem when you try to port to a new DBMS. This problem stems from the nature of table names in TTable.

  • dBase - table.dbf
  • SQL Server - dbo.table

This can be very annoying, because it will not be enough to retarget the TDatabase object or the alias - you will have to arrange to change the table names as well. If you want the program to be able to use both DBMS (for instance, you have a product that targets more than one DBMS depending on installation size). If this is the case for you, you have two options - either convert to TQuery (recommended) or convert the table names at run time (potentially annoying, but possible).

String __fastcall ReplaceTableInLookupFieldsEntry(String theEntry)
{
   TStringList   *Parsed = new TStringList;

   String ResultText = "";

   try
   {
      Parsed->Text = StringReplace(theEntry,";","\n",TReplaceFlags()<<rfReplaceAll<<rfIgnoreCase);


      Parsed->Strings[2] = "dbo." + WithoutExtension(Parsed->Strings[2]);

      for (int Index = 0; Index < Parsed->Count; Index++)
      {
         if (Index > 0) ResultText = ResultText + ";";
         ResultText = ResultText + Parsed->Strings[Index];
      };
   }
   __finally
   {
      delete Parsed;
   };

   return ResultText;
}

void PACKAGE __fastcall RetargetTablesToSQLServer(TComponent *theDataModuleOrForm)
{
   for (int Index = 0; Index < theDataModuleOrForm->ComponentCount; Index++) // Not queries, bins, or date range tables
   {
      TDataSet *DataSet = dynamic_cast<TDataSet *>(theDataModuleOrForm->Components[Index]);

      if (DataSet != NULL)
      {
         TTable *Table = dynamic_cast<TTable *>(DataSet);

         if (Table != NULL)
         {
            Table->TableName = "dbo." + StringReplace(WithoutExtension(Table->TableName),"Order","Orders",TReplaceFlags()<<rfReplaceAll<<rfIgnoreCase);
            Table->IndexFieldNames = Table->IndexName;

            TwwTable *wwTable = dynamic_cast<TwwTable *>(Table);
            
            if (wwTable != NULL)
            {
               for (int LookupFieldsIndex = 0; LookupFieldsIndex < wwTable->LookupFields->Count; LookupFieldsIndex++)
               {
                  wwTable->LookupFields->Strings[LookupFieldsIndex] = ReplaceTableInLookupFieldsEntry(wwTable->LookupFields->Strings[LookupFieldsIndex]);
               };
            };
         }
         else
         {
            TQuery *Query = dynamic_cast<TQuery *>(DataSet);

            if (Query != NULL)
            {
               Query->SQL->Text = StringReplace(Query->SQL->Text,":PVMapping:Order",":PVMapping:Orders",TReplaceFlags()<<rfReplaceAll);
               Query->SQL->Text = StringReplace(Query->SQL->Text,"\"Order\"","Orders",TReplaceFlags()<<rfReplaceAll<<rfIgnoreCase);
               Query->SQL->Text = StringReplace(Query->SQL->Text,"\"DATE\"","DATE",TReplaceFlags()<<rfReplaceAll<<rfIgnoreCase);
               Query->SQL->Text = StringReplace(Query->SQL->Text,"Orderss","Orders",TReplaceFlags()<<rfReplaceAll<<rfIgnoreCase); // Just in case the prior replace made a replacement it shouldn't
            };
         };
      }
      else
      {
         TField *Field = dynamic_cast<TField *>(theDataModuleOrForm->Components[Index]);

         if (Field != NULL)
         {
            if (Field->FieldName == "DESC") Field->FieldName = "DESCR";
            if (Field->LookupResultField == "DESC") Field->LookupResultField = "DESCR";
         };
      };
   };
}

These library functions convert the tables for a data module or form. The only implementation not shown is "WithoutExtension" which simply lops off the last 4 chars of the string, if the first character of that section is "."

Footnotes

1) A domain can be a continuous one (like the domain of integers) or a set (like the set of valid status codes). In the discussion of domains in this article, the second meaning is intended.

2) For some implementations, it is desirable to use TQuery rather than TTable. Either choice is reasonable.

Copyright © 2004 by Mark Cashman (unless otherwise indicated), All Rights Reserved