sql - One table or many for many different but interacting events?


I'm creating an application whose core function is to track various data over time (blood glucose levels, insulin dosages, food intake, etc.), and I'm trying to decide how best to organize the information in the database.

At the most basic level, within a particular umbrella event, I thought of having a single events table with fields for all the properties that might come up. That may be unwieldy, though, because the vast majority of fields would end up being blank for many of the entries; I'm not sure if that's a problem. The benefit of doing it that way is that it is easier to call up and display the events. Since many of the events have only a 'timestamp' in common, I question whether they belong in the same table.

I'm also not sure it makes sense to have a table for every kind of event, because taken separately most of the events have only one property other than the timestamp, and they would have to be co-mingled. (Many types of data do not come in a group.)

Some types of events have durations. Some are comparatively rare. One class of events has a rate that stays the same unless the rate is altered or temporarily overridden (those are the ones I'm worried about). Some are simple binary tags (which I was planning on having a linking table for), which would be easy if I need/prefer an overall event_id to link them with.

My inclination is that it is best to have a few tables for closely related types of information rather than one table and a whole lot of empty space, but I'm not quite sure how to proceed.

I would love advice on a strategy for determining the best approach in a situation like this.

Edit: here is a rundown of the data types I'm dealing with, in case it makes things clearer.

Events:

- blood glucose
  - timestamp
  - value
  - tagged with: pump, manually entered; [pre-meal, post-meal (breakfast, lunch, dinner), before bed, fasting, hypo, high, hyper - either manually entered or inferred based on settings or other user entries]; before/after exercise, etc. I imagine I'm better off with dynamically generated queries where necessary. Though should I apply the same paradigm to meals?

- sensor glucose (must be separate because it is not as reliable and is a different number than a regular bg test, and is unlikely to be used by the majority of users)
  - timestamp
  - amount

- bolus
  - (timestamp)
  - bolus total
  - food total
  - correction total
  - active insulin**
  - bolus type - normal [the vast majority], square wave, or dual wave

- food
  - (timestamp)
  - carb amount
  - carb type (by weight or exchanges) <- in the user settings table
  - food description
  - carb estimated (binary)
  - meal? - or a separate table
  - (accompanying bolus id? though that seems finicky)

- meals
  - timestamp
  - meal name (breakfast, lunch, supper) (or a mealnames table? seems excessive?)

- basal
  - timestamp
  - rate per hour
  - the rate changes throughout the day on a regular pattern, so either automatically fill in the 'last activated pattern' (in the form midnight: 0.7/hr, 7am: 0.9/hr, 12pm: 0.8/hr, etc.) or create a new pattern whenever one is used

- temp basal (the regular basal pattern can be overridden by a temporary basal)
  - temp basal start
  - ?temp basal end and/or temp basal duration
  - temp basal amount
  - temp basal type -> either in % or a specific rate

- exercise
  - start time
  - end time
  - intensity
  - ?description (unless 'notes' becomes a universal event)

- pump rewind (every 3 days or so)
  - time

- pump prime
  - amount
  - type (fixed or manual)

- pump suspended
  - start time
  - end time

- keytones
  - time
  - result

- starred
  - event

- flagged
  - event

- notes
  - timestamp
  - (the user can place a note on any event to provide details or comments, and might want to note that there is no data as well)

(I want a way for users to flag specific events to indicate they are the result of an error or otherwise suspect, and to star events that are noteworthy, either to discuss with a doctor or to look at later.)

**Active insulin is only in place when a bolus is entered, but it would be useful at other times as a tracked variable, calculated by looking at the boluses delivered x time ago with x active insulin time.

Other infrequent events (likely 2-10 per year):

- hba1c
  - time
  - value

- weight
  - time
  - value
  - units

- cholesterol
  - time
  - value

- blood pressure
  - time
  - value

- pump settings (I will need to track settings changes, so I should be able to run queries)
  - timestamp
  - bg target
  - active insulin time
  - carb ratios (change throughout the day like basal)
  - sensitivity

My concerns: 1) Should there be an overarching 'events' table with a type column, so I can bring up all events in a period of time without having to query every single table? (The disadvantage: how would that work for events with a duration? Have an optional end-time on the events table?)

2) The local database is for one user, and there will never be a need to compare or interact with the records of other users, so if it is synced online I'm thinking of keeping one version of the database per user, though maybe adding a 'user' id when it is uploaded.

3) many of events go ease of interpretation , analysis (blood sugar, meal, food, bolus, notes instance), gather it's better after fact queries rather hardcoding maintain integrity.

Some info on what the database will be used for:

- a visual representation of the data types over the course of a day
- average test results, and the percent of insulin used for food, correction, and basal
- more specific advanced queries, like: list 20 examples of the difference in glucose level between the before-bed glucose and the morning glucose when no food was eaten and there was no exercise within 2 hours of bed, since the settings were last changed, etc.
- having the program automatically assign tags based on parameters. If >20 carbs are eaten during the assigned 'lunch' period, the food is lunch. If there are 2 food intakes within 30 minutes (or the 'meal length' preference), group them as 1 meal. I'm not totally sure how that will function right now.
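For example, the lunch-tagging rule could be an after-the-fact UPDATE rather than something hardcoded at entry time. A minimal sketch using SQLite from Python; the table and column names (food, carb_amount, meal_tag) and the 11:30-14:00 lunch window are illustrative assumptions, not a final schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE food (
        food_id     INTEGER PRIMARY KEY,
        eaten_at    TEXT NOT NULL,      -- ISO-8601 datetime
        carb_amount REAL NOT NULL,
        meal_tag    TEXT                -- filled in by the tagging query
    );
""")
conn.executemany(
    "INSERT INTO food (eaten_at, carb_amount) VALUES (?, ?)",
    [("2011-02-01 12:15:00", 45.0),   # lunch-sized intake in the lunch window
     ("2011-02-01 12:40:00", 10.0),   # snack in the window, under the threshold
     ("2011-02-01 16:00:00", 30.0)],  # outside the window
)

# Tag as 'lunch' any intake of more than 20 g carbs in the lunch window.
conn.execute("""
    UPDATE food
    SET meal_tag = 'lunch'
    WHERE carb_amount > 20
      AND time(eaten_at) BETWEEN '11:30:00' AND '14:00:00'
""")

tags = [row[0] for row in
        conn.execute("SELECT meal_tag FROM food ORDER BY eaten_at")]
print(tags)   # ['lunch', None, None]
```

Grouping two intakes within the 'meal length' preference into one meal could be handled the same way, as a query run after the data is in.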

v1.0

Relational databases, and SQL (which was designed for them), perform better when the data is organised and normalised. One big table is un-normalised, and crippled, in terms of both performance and relational power.

Your requirement calls for an ordinary supertype-subtype cluster of tables. Unfortunately, ordinary relational structures such as this are not "common".

  • The standard subtype symbol is the semicircle.

    • The cardinality of supertype::subtype is 1::0-to-1.

    • The subtype primary key is the supertype primary key. It is also the foreign key to the supertype.

  • There are two types:

    • Exclusive, where there is one subtype for each supertype row, denoted by an X through the semicircle.

    • Non-exclusive, where there is more than one subtype per supertype row.

  • Yours is exclusive. That type needs a discriminator, to identify which subtype is active for the supertype row. Where the number of subtypes is small, indicators can be used; otherwise a classification table is required.

  • Note that all of this (the structures, rules, and constraints required to support it, and to provide data integrity) is available in ordinary IEC/ISO/ANSI SQL. (The non-SQLs do not comply with the SQL requirement.)
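As a sketch of the structure described above, here is what a supertype with one subtype might look like, executed through Python's sqlite3 so it can be run directly. The table and column names (Reading, ReadingGlucoseBlood) and the type codes are illustrative assumptions, not the linked model; note the subtype's PK is the supertype's PK, which is also the FK:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
conn.executescript("""
    -- Supertype: one row per reading, with a discriminator column.
    CREATE TABLE Reading (
        PatientId       INTEGER NOT NULL,
        ReadingDtm      TEXT    NOT NULL,
        ReadingTypeCode TEXT    NOT NULL CHECK (ReadingTypeCode IN ('BG','SG','B')),
        PRIMARY KEY (PatientId, ReadingDtm)
    );

    -- Subtype: its PK is the supertype's PK, which is also the FK.
    CREATE TABLE ReadingGlucoseBlood (
        PatientId  INTEGER NOT NULL,
        ReadingDtm TEXT    NOT NULL,
        Value      REAL    NOT NULL,
        PRIMARY KEY (PatientId, ReadingDtm),
        FOREIGN KEY (PatientId, ReadingDtm)
            REFERENCES Reading (PatientId, ReadingDtm)
    );
""")

conn.execute("INSERT INTO Reading VALUES (1, '2011-02-01 08:00:00', 'BG')")
conn.execute("INSERT INTO ReadingGlucoseBlood VALUES (1, '2011-02-01 08:00:00', 5.6)")

# A subtype row without its supertype row is rejected by the FK.
try:
    conn.execute(
        "INSERT INTO ReadingGlucoseBlood VALUES (1, '2011-02-01 09:00:00', 6.1)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
print(orphan_allowed)   # False
```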

Data

  1. Naming is important. You are advised to name a table by the row, not by the content, meaning, or action. You speak of events, but I can see readings.

  2. There must be a context for these readings or events: an EventId cannot just hang in the air. I have assumed the readings are for a particular patient. Please advise, and I will change the model.

  3. Composite or compound keys are normal; SQL is quite capable of handling them (the non-SQLs aren't). PatientId already exists as an FK in Reading, and it is used to form the PK. There is no need for an additional ReadingId column and the additional index, which would be 100% redundant.

  4. SQL is quite capable of handling many tables (the database I am working on exceeds 500 tables), and large numbers of smaller tables are the nature of relational databases.

  5. This is pure Fifth Normal Form (no columns duplicated; no update anomalies).

    • It can be further normalised to Sixth Normal Form, and further benefits gained; and the 6NF can then be optimised, etc.; but none of that is required here.

    • Some tables happen to be in 6NF, but that is a consequence, not the intent, so they cannot be declared as such.

  6. If you provide information about the limits and overrides that concern you, I can provide a model that resolves those issues.

  7. Since the data is now modelled, it is set up for fast comparisons (generating alarms, etc.).

▶Reading Data Model◀

Readers who are unfamiliar with the standard for modelling relational databases may find the ▶IDEF1X Notation◀ useful.

Feel free to ask clarifying questions, either as comments, or as edits to the question.

Caveat

  1. The OO and ORM crowd (led by Fowler and Ambler) are clueless about relational technology and databases. Designing objects is quite different from modelling data. If you apply object design to databases, you end up with monstrosities that need "re-factoring", and you have to buy yet another "book" that shows you how to do that efficiently. In the meantime the "database" is crippled.

  2. Relational databases that are modelled correctly (as data, not objects) never need "re-factoring". In highly normalised databases, you can add tables, columns, and functions without having to change existing data or code.

  3. Even the concept of ORM is totally flawed. Data has more permanence than objects. If you model the data first, then model your objects for that data, all is stable. But if you model your objects first (which is weird anyway, without an understanding of the data), then model the data after the objects, you will be going back and forth, correcting both.

  4. Relational databases have had ordinary structures such as supertype-subtype for over 30 years, and they work well, if implemented as exactly that. They are not "gen-spec" or "class-inheritance" or any such OO thing; and if those OO or ORM structures are implemented without modelling the data correctly, the "database" is crippled, and needs "re-factoring".

    • Additionally, such designs do not implement the required data integrity constraints, so the data quality is poor. Correctly modelled databases do not allow bad data to enter the database; those "databases" are full of bad data, and need another "book" on how to wash dirty data.

  5. They have the sequence, and the hierarchy, mixed up. Done correctly, there is no "impedance mismatch", and no pseudo-technical names to mask pure stupidity and justify doing the same set of work over and over and over again.

So run like hell from anyone using OO or ORM terminology when dealing with relational databases.

v1.1

Your edit provides far more detail, which of course was demanded, because the context, the whole, is necessary if the data is to be modelled correctly. This incorporates that info. However, questions remain, and some back-and-forth will be required before we can complete it. Feel free to ask about anything that is not absolutely clear; I am not sure of the gap until you throw something up, and we speak about it.

▶Event Data Model v1.1◀

  1. All my models are pure relational (they retain full relational power), IDEF1X compliant, and in Fifth Normal Form (no update anomalies). All rules (business rules and data/referential integrity) drawn in the model can be implemented as declarations in ISO/IEC/ANSI SQL.

  2. Never hard-code anything. My models do not require it, and code working against the database does not have to do it. All fixed text is normalised into reference or look-up tables. (That part is incomplete; I need you to fill in the gaps.)

    • A short alphabetic code is far better than an enumeration; once you have used it for a while, the values and their meanings become recognisable.

    • Because these are PKs, and therefore stable, you can safely code:

      ... WHERE EventTypeCode = "P"
      or
      ... WHERE EventTypeCode LIKE "T%"

  3. I believe the datatypes are self-evident or can be worked out easily. If not, please ask.

  4. Everything you noted as "finicky" is valid. The issue is that, since you have not yet had a database to engage with, you did not know what belongs in the database vs what should or can be done in SQL code. The "finicky" items have therefore been provided for (the database elements); you need to construct the code. Again, if there is a gap, please ask.

    • What I am saying is, working in the traditional style of me as data modeller and you as developer, you have to ensure that every item from your perspective is delivered, rather than relying on me to interpret your notes. I am delivering a database that supports all the requirements I can glean from your notes.
  5. One patient per database? Let's allow for the possibility that the system is successful and, in the future, you have one central workhorse database, rather than limiting yourself to one database per patient, which would be a nightmare to administer. Let's also say you need to keep patient details in one place, one version of the truth. I have provided that. It does not stop you, in the short term, from implementing one db per patient; there is no problem at all with having just one row in the Patient table.

    • Alternately, you can strip PatientId out of all the tables, but when you grow into the central database configuration, that will require a major database upgrade.

    • Likewise, if you have sensors or pumps that you need to track, please identify their attributes. Sensor or pump attributes would be normalised into their own tables. If it is "one per patient", that's fine; there will be one row in those tables, unless you need to store the history of sensors or pumps.

  6. In v1.0 the subtypes were exclusive. They are now non-exclusive. That means you are tracking the chronology of events, without duplication, and a single event may consist of more than one subtype; eg. notes can be inserted for any event.

    • Before completion, the EventType list you provided needs to be filled out in the form of a grid, showing (a) the permitted and (b) the mandatory subtypes per EventType. That will be implemented as CHECK constraints in Event.
  7. Naming is important. I am using ISO standard 11179 (guidelines and principles) plus my own conventions. Reading-type events are prefixed as such. Feel free to suggest changes.

  8. Units. Traditionally, we use either metric xor imperial across the whole database, allow entry in whatever units the user likes, and convert before storage. If you need a mixture, you should at least have the UnitType specified at the patient or pump level, rather than allowing the storage of either UnitType everywhere. If you truly need either UnitType stored, changing back and forth, then yes, you need to store the UnitType with each such value.

  9. Temporal database. You have time series being recorded, to be interpreted via SQL. It is a big subject, so read up on it. The minimum I ask is that you read and understand:

    ▶Temporal Database Performance (0NF vs 5NF)◀

    ▶Classic 5NF Temporal Database◀ (inspect the data model carefully)

  10. Basically the issue boils down to this:

    • Either you have a true 5NF database, with no data duplication and no update anomalies.

      • That means, for a continuous time series, only the StartDateTime is recorded. The EndDateTime is derived from the StartDateTime of the next row; it is not stored. Eg. Event is a continuous chronology; the EventType identifies whether the Event is a specific datetime or a period/duration.

      • The EndDateTime is stored only for disjoint periods, where there are legitimate gaps between periods; in our case these are identified via EventType. Eg. Exercise, PumpSuspended. (Incidentally, I am suggesting the patient knows the actual, as opposed to planned, attributes only at the end of the exercise period.)

      • Since there is mostly no EndDateTime, the StartDateTime is simply the row's datetime. Eg. EventDtm.

      • This requires the use of ordinary SQL subqueries. That is quite simple once the coder has a grasp of the subject; for those who don't, I have supplied a full tutorial on subqueries in general, and on using them in the temporal context in particular, in:

      ▶It Is Easy When You Know How◀. Not coincidentally, it is about the same Classic 5NF Temporal Database above.

    • Xor you have a database with an EndDateTime stored (100% duplication) alongside every StartDateTime column, and you can use flat, slow queries: lots of manipulation of large result sets with GROUP BYs, instead of small result sets. Massive data duplication and update anomalies will have been introduced, reducing the database to a flat file, to supply the needs of coders of limited ability (it is certainly not "ease of coding").

    • Therefore, consider and choose, for the long term only, because this affects every code segment that accesses temporal data. You do not want a re-write halfway down the track, when you realise that maintaining the update anomalies is worse than writing subqueries.

      • Of course, I will provide the explicit requirements to support the 5NF temporal database: correct datatypes, etc., to support the identified requirements.

      • Further, if you choose 0NF, I will provide those fields, so the data model will be complete for your purpose.

      • In either case, you need to work out the SQL code required for any given query.

  11. Datatype handling is important. Do not store time (hours, etc.) as an integer or an offset; store it as a TIME or DATETIME datatype. If it is an offset, store it as the time since midnight. That allows unrestricted SQL, including the date arithmetic functions.

  12. A task for you. Go through the model carefully, and ensure that:

    • every non-key attribute has a 1::1 relationship with its primary key,

    • and that it does not have a relationship with any other PK (in any other table);

    and of course, check the model and provide feedback.
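To illustrate point 10, here is the 5NF style in action: only the StartDateTime is stored, and the end of each period is derived as the StartDateTime of the next row, via an ordinary correlated subquery. A sketch run against SQLite from Python; the two-column Event table is a stand-in for the model, not the model itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Event (EventDtm TEXT PRIMARY KEY, Rate REAL)")
conn.executemany("INSERT INTO Event VALUES (?, ?)", [
    ("2011-02-01 00:00:00", 0.7),   # basal pattern: rate changes are the rows
    ("2011-02-01 07:00:00", 0.9),
    ("2011-02-01 12:00:00", 0.8),
])

# EndDateTime is not stored: it is the StartDateTime of the next row.
rows = conn.execute("""
    SELECT e.EventDtm,
           (SELECT MIN(n.EventDtm)
              FROM Event n
             WHERE n.EventDtm > e.EventDtm) AS EndDtm,
           e.Rate
      FROM Event e
     ORDER BY e.EventDtm
""").fetchall()

for start, end, rate in rows:
    print(start, "->", end, rate)
# the last (current) period has no end yet, so its EndDtm is NULL/None
```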

Question

Given the above explanations and guidance:

  • What is ReadingBasalTemperature.Type? List the values, please.

  • What is HbA1c?

  • What is Keytone?

  • Do the following need an end (ie. a duration/period with an EndDateTime):

    • ReadingBasalTemperatureEnd
    • ReadingBolusEnd
    • basal pattern
    • basal temp pattern
    • actually, what is a pattern, and how is it derived/compared?

  • How is BasalTemperatureEnd (or the duration) determined?

  • My starting position is that there is no need to store the active insulin duration, but we need to define how the EndDateTime is determined. Based on that, if it cannot be derived, or if it is based on too many factors or changes over time, then storing the EndDateTime might be good.

  • The pump settings need clarification.

v1.2

Ok, I have incorporated all the information you have provided in the question and comments. Here is the progressed data model.

▶Event Data Model v1.2◀

There are still issues to be resolved.

  • Use percentage or rate only, not both plus an additional indicator; one can be derived from the other. I am using rate consistently.

  • ... I worry that with this approach many days the basal rate will be identical ... hence the redundancy

    • That is not "redundancy". It is the storage of a time series of facts, which happen to be unchanging. The queries required are straight-forward.

    • However, in advanced use, yes, we can avoid storing the unchanged fact, and instead extend the duration to include the new time interval.

  • I am still not clear on your explanation of the basal temp. Please study the new model. First, the patterns are stored separately. Second, we are recording the basal temp start rate. Do you need a basal temp end (with a rate)?

  • "GlucoseEventType should be able to have more than one value per glucose result" needs more definition. Don't worry about the Id keys; tell me about the data. For each ReadingGlucoseBlood, name the result values, and the GlucoseEventTypes that apply to them, both mandatory and optional.

  • PumpHistory.InsulinEndDateTime is the ending instant of a duration. It is of course generic: the starting instant is that of whatever row you compare it to. It should be seconds or minutes since midnight 01 Jan 1900.

  • Check the new Event PK. Where an incoming record identifies several events, you need to parse it, and insert each Event and EventSubtype row using the same datetime.

  • Except for Patient, there are no Id keys in the database; none are required thus far. Refer to the parent by its full PK.
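The parsing step above can be sketched as follows: one incoming record that bundles, say, a bolus and a food intake is split into one supertype row plus one subtype row per item, all sharing the same datetime. The table names (Event, EventBolus, EventFood) and the incoming dict are hypothetical placeholders for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Event (
        EventDtm TEXT PRIMARY KEY
    );
    CREATE TABLE EventBolus (
        EventDtm   TEXT PRIMARY KEY REFERENCES Event (EventDtm),
        BolusTotal REAL NOT NULL
    );
    CREATE TABLE EventFood (
        EventDtm   TEXT PRIMARY KEY REFERENCES Event (EventDtm),
        CarbAmount REAL NOT NULL
    );
""")

# One incoming record that bundles several event types under one row.
incoming = {"dtm": "2011-02-01 12:15:00", "bolus_total": 4.5, "carb_amount": 45.0}

# Parse it: one supertype row, then one subtype row per item, same datetime.
conn.execute("INSERT INTO Event VALUES (?)", (incoming["dtm"],))
if incoming.get("bolus_total") is not None:
    conn.execute("INSERT INTO EventBolus VALUES (?, ?)",
                 (incoming["dtm"], incoming["bolus_total"]))
if incoming.get("carb_amount") is not None:
    conn.execute("INSERT INTO EventFood VALUES (?, ?)",
                 (incoming["dtm"], incoming["carb_amount"]))

subtype_count = conn.execute("""
    SELECT (SELECT COUNT(*) FROM EventBolus) + (SELECT COUNT(*) FROM EventFood)
""").fetchone()[0]
print(subtype_count)   # 2
```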

05 Feb 11

No feedback was received re v1.2.

A lot of the data I'm getting is being pulled from an external (and disorganized) CSV that groups event types under one row and has events on the same second, which is as granular as it gets.

That is easy to overcome. However, it means an instant is not an instant. Now, we could walk through the whole exercise, but the bottom line is simple.

  • If you need it, add a SequenceNo to the PK, to make it unique. I suspect the EventTypeCode is enough (there cannot be more than one event of the same EventType per second). If not, let me know, and I will change the model.

  • Retain the meaning of an instant as an instant, and avoid departing from the architectural requirements of temporal databases.

  • Use the EventType to afford uniqueness to the datetime PK.

    • Keep in mind that the EventTypeCode deployed in the Event PK is not there as the discriminator requirement, but to afford uniqueness. Its presence in the PK of the subtypes is an artefact of that, not of the discriminator (which is known by virtue of the subtype).

  • However, there is unnecessary complexity due to the non-exclusive subtype (there can be more than one subtype per supertype row).

  • Therefore I have changed it to an exclusive subtype, which is deterministic: one EventType per supertype row; at most one subtype.

Refer to Implementing Referential Integrity for Subtypes for specific information re constraints, etc.

The change to the data model is too small to warrant a release, so I have simply updated the v1.2 data model.
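One way to sketch the constraint side of this: with an exclusive subtype, the discriminator can be carried into the subtype's FK and pinned with a CHECK, so a subtype row can only attach to a supertype row of the matching type. SQLite via Python again; the names and type codes are assumptions for illustration, and a full treatment per the linked article would add further constraints in the target SQL platform:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Event (
        EventDtm      TEXT NOT NULL,
        EventTypeCode TEXT NOT NULL CHECK (EventTypeCode IN ('BG','N')),
        PRIMARY KEY (EventDtm, EventTypeCode)
    );

    -- The subtype carries the discriminator in its PK/FK and pins it with a
    -- CHECK, so it can only reference a supertype row of the matching type.
    CREATE TABLE EventGlucoseBlood (
        EventDtm      TEXT NOT NULL,
        EventTypeCode TEXT NOT NULL CHECK (EventTypeCode = 'BG'),
        Value         REAL NOT NULL,
        PRIMARY KEY (EventDtm, EventTypeCode),
        FOREIGN KEY (EventDtm, EventTypeCode)
            REFERENCES Event (EventDtm, EventTypeCode)
    );
""")

conn.execute("INSERT INTO Event VALUES ('2011-02-01 08:00:00', 'BG')")
conn.execute("INSERT INTO Event VALUES ('2011-02-01 08:00:00', 'N')")

# Attaching the glucose subtype to the 'N' (note) supertype row is rejected.
try:
    conn.execute(
        "INSERT INTO EventGlucoseBlood VALUES ('2011-02-01 08:00:00', 'N', 5.6)")
    wrong_type_allowed = True
except sqlite3.IntegrityError:
    wrong_type_allowed = False
print(wrong_type_allowed)   # False
```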

06 Mar 11

Due to upholding the "above all, be technically honest" guideline in the FAQ, and confronting misinformation as requested therein, I have suspended my efforts (which means I no longer correct misinformation on SO, and such posters now have a protected reign). The interaction with the seeker carried on, to completion, away from SO, and the final data model was completed. The progression is therefore lost to readers. However, it may be of value, so I am posting the result: ▶Final Data Model v1.16◀.

  • Events have a starting instant (Event.DateTime).
  • Events may have durations, in which case an ending instant (Event) is required.
  • Some events consist of the supertype only; others require a subtype. This is identified in the third column of the EventType exposition.
  • The fourth column identifies the type of event:
    • instant or duration
    • for durations: conjunct or disjunct
  • Note that the resolution of datetime on the seeker's platform is one second, and many events may occur in one second, but not more than one of the same EventType. EventTypeCode has therefore been included in the Event primary key to implement that rule. It is an artefact, not a generic requirement of the supertype-subtype structure or of exclusive/non-exclusive subtypes.
  • It is intended for printing on two facing letter pages, enlarged or not, with a gusset.
