SQL - one table or many for different but interacting events?
I'm creating an application whose core function is to track various data over time (blood glucose levels, insulin dosages, food intake, etc.), and I'm trying to decide how best to organize the information in the database.
At the most basic level, within a particular umbrella event, I thought of having a single events table with fields for all the properties that might come up. That may be unwieldy, though, because the vast majority of fields would end up being blank for many of the entries; I'm not sure if that's actually a problem. The benefit of doing it that way is that it is easier to call up and display the events. Since many of the events have only a timestamp in common, I question whether they belong in the same table.
On the other hand, I'm not sure it makes sense to have a table for every kind of event, because taken separately most of the events have only one property other than the timestamp, and they constantly have to co-mingle. (Many types of data do not come in a group.)
Some types of events have durations, though those are comparatively rare. One class of events is a rate that stays the same unless the rate is altered or temporarily overridden (those are the ones I'm most worried about). There are also simple binary tags (which I was planning on having a linking table for), which is why I need/prefer an overall event_id to link them with.
My inclination is that it is best to have a few tables for closely related types of information, rather than one table with a whole lot of empty space, but I'm not quite sure how to proceed.
I would love advice on a strategy for determining the best approach in a situation like this.
Edit: here is a rundown of the data types I'm dealing with, in case it makes things clearer.
Events:
- Blood glucose: timestamp, value. Tagged with: pump, manually entered [pre-meal, post-meal (breakfast, lunch, dinner), before bed, fasting, hypo, high, hyper - either manually entered or inferred based on settings or other user entries], before/after exercise, etc. I imagine these are better off dynamically generated by queries where necessary, though would I apply the same paradigm to meals?
- Sensor glucose: timestamp, amount. (Must be separate, because it is not as reliable and gives a different number than a regular BG test, and is unlikely to be used by the majority of users.)
- Bolus: timestamp, bolus total, food total, correction total, active insulin**, bolus type - normal [the vast majority], square wave, or dual wave.
- Food: timestamp, carb amount, carb type (by weight or exchanges <- set in a user settings table), food description, carb-estimated (binary), meal? - or a separate table (with an accompanying bolus id? though that seems finicky).
- Meals: timestamp, meal name (breakfast, lunch, supper) (or a mealnames table? seems excessive?).
- Basal: timestamp, rate per hour. The rate changes throughout the day on a regular pattern; either automatically fill in the 'last activated pattern' (in the form midnight: 0.7/hr, 7am: 0.9/hr, 12pm: 0.8/hr, etc.) or create a new pattern whenever one is used.
- Temp basal (the regular basal pattern can be overridden by a temporary basal): temp basal start, ?temp basal end and/or temp basal duration, temp basal amount, temp basal type -> either a % or a specific rate.
- Exercise: start time, end time, intensity, ?description (unless 'notes' are universal to all events).
- Pump rewind (every 3 days or so): time.
- Pump prime: amount, type (fixed or manual).
- Pump suspended: start time, end time.
- Ketones: time, result.
- Starred event.
- Flagged event.
- Notes: timestamp. (The user can place a note on any event to provide details or comments, and might want to note that there is no data as well.)
(I want a way for users to flag specific events to indicate they were the result of an error or are otherwise suspect, and to star events as noteworthy, either to discuss with a doctor or to look at later.)
**The bolus is the only place active insulin appears when a bolus is entered, but it is useful at other times as a tracked variable, calculated by looking at the boluses delivered x time ago by the active insulin time.
Other infrequent events (likely 2-10 per year):
- HbA1c: time, value.
- Weight: time, value, units.
- Cholesterol: time, value.
- Blood pressure: time, value.
- Pump settings (I will need to track settings changes, and should be able to run queries over them): timestamp, BG target, active insulin time, carb ratios (change throughout the day like basal), sensitivity.
My concerns:
1) Should there be an overarching 'events' table with a type, so I can bring up all events in a period of time without having to query every single table? (A disadvantage: how would that work for events with a duration? Have an optional end time on the event table?)
2) The local database is for one user, and there will never be a need to compare or interact with the records of other users. If synced online, I'm thinking of keeping one version of the database per user, though maybe adding a 'user' id when uploaded.
3) Many of the events go together for ease of interpretation and analysis (blood sugar, meal, food, bolus, and notes, for instance). I gather it's better to do that with after-the-fact queries rather than hardcoding it, to maintain integrity.
Some info on what the database will be used for:
- A visual representation of the data types over the course of a day.
- Average test results, and the percent of insulin used for food, correction, and basal.
- More specific advanced queries like: list 20 examples of the difference in glucose level between the before-bed glucose and the morning glucose, when no food was eaten and no exercise took place within 2 hours of bed, since the settings were last changed, etc.
- The program should automatically assign tags based on parameters. E.g., if >20 carbs are eaten during the assigned 'lunch' period, the food is lunch. If there are 2 food intakes within 30 minutes (or the 'meal length' preference), group them as 1 meal. I'm not totally sure how that will function right now.
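As a rough illustration of the kind of "before-bed vs next-morning" query involved, here is a sketch using SQLite from Python. The single-table layout, and all table/column names and values, are invented for the example only; they are not a schema recommendation.

```python
import sqlite3

# Hypothetical single-table layout for glucose readings (names invented).
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE reading_glucose (
        reading_dtm TEXT PRIMARY KEY,   -- ISO-8601 timestamp
        value       REAL NOT NULL,      -- mmol/L
        tag         TEXT                -- e.g. 'before-bed', 'fasting'
    )""")
con.executemany(
    "INSERT INTO reading_glucose VALUES (?, ?, ?)",
    [("2011-02-01 22:30:00", 7.2, "before-bed"),
     ("2011-02-02 07:10:00", 5.4, "fasting")])

# Pair each before-bed reading with the earliest later fasting reading,
# and compute the overnight difference.
row = con.execute("""
    SELECT b.reading_dtm, ROUND(f.value - b.value, 1) AS delta
    FROM   reading_glucose b
    JOIN   reading_glucose f
      ON   f.tag = 'fasting'
     AND   f.reading_dtm = (SELECT MIN(reading_dtm)
                            FROM   reading_glucose
                            WHERE  tag = 'fasting'
                              AND  reading_dtm > b.reading_dtm)
    WHERE  b.tag = 'before-bed'
""").fetchone()
print(row)   # ('2011-02-01 22:30:00', -1.8)
```

The pairing subquery generalizes directly to the "exclude exercise within 2 hours of bed" style of condition by adding further predicates.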
v1.0
Relational databases, and SQL (which was designed for them), perform better when the data is organised and normalised. One big table is un-normalised, and crippled in terms of both performance and relational power.
Your requirement calls for an ordinary supertype-subtype cluster of tables. Unfortunately such ordinary relational structures are not "common" knowledge.
The standard subtype symbol is a semicircle.
The cardinality of supertype::subtype is 1::0-to-1.
The subtype primary key is the supertype primary key; it is also the foreign key to the supertype.
There are two types:
- Exclusive: there is exactly one subtype for each supertype row, denoted by an X through the semicircle.
- Non-exclusive: there can be more than one subtype per supertype row.
Yours are exclusive. That type needs a discriminator, to identify which subtype is active for the supertype row. Where the number of subtypes is small, indicator columns can be used; otherwise a classification table is required.
Note that all of this - the structures, the rules, and the constraints required to support it and to provide data integrity - is available in ordinary IEC/ISO/ANSI SQL. (The non-SQLs do not comply with the SQL requirement.)
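A minimal sketch of such an exclusive supertype-subtype pair, using SQLite from Python (all table, column, and code names are invented for the example): the subtype's PK is the supertype's PK, which is also the FK to the supertype, and the EventTypeCode acts as the discriminator.

```python
import sqlite3

# Supertype (event) and one exclusive subtype (reading_glucose_blood).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE event (
        event_dtm       TEXT NOT NULL,
        event_type_code TEXT NOT NULL,               -- discriminator
        PRIMARY KEY (event_dtm, event_type_code)
    );
    CREATE TABLE reading_glucose_blood (
        event_dtm       TEXT NOT NULL,
        event_type_code TEXT NOT NULL
                        CHECK (event_type_code = 'G'),  -- glucose rows only
        value           REAL NOT NULL,                  -- mmol/L
        PRIMARY KEY (event_dtm, event_type_code),
        FOREIGN KEY (event_dtm, event_type_code)
            REFERENCES event (event_dtm, event_type_code)
    );
""")
con.execute("PRAGMA foreign_keys = ON")
con.execute("INSERT INTO event VALUES ('2011-02-01 08:00:00', 'G')")
con.execute("INSERT INTO reading_glucose_blood "
            "VALUES ('2011-02-01 08:00:00', 'G', 5.1)")

# A subtype row without its supertype row is rejected by the FK:
try:
    con.execute("INSERT INTO reading_glucose_blood "
                "VALUES ('2011-02-01 09:00:00', 'G', 6.0)")
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The CHECK constraint plus the shared composite key is the declarative part; a full implementation would add the reverse check (every supertype row with that code has its subtype row).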
Data
Naming is important. You are well advised to name a table for the row, not for the content, meaning, or action. You speak of Events, but I see Readings.
There must be a context for these readings or events; an EventId cannot simply hang in the air. I have assumed the readings are for a particular Patient. Please advise, and I will change the model.
Composite or compound keys are normal; SQL is quite capable of handling them (the non-SQLs aren't).
PatientId exists as an FK in Reading, and is used to form its PK. There is no need for an additional ReadingId column and its additional index; they would be 100% redundant. SQL is quite capable of handling many tables (the database I am working on exceeds 500 tables); large numbers of smaller tables are the nature of relational databases.
This is pure Fifth Normal Form (no columns duplicated; no update anomalies).
It can be further normalised to Sixth Normal Form, and further benefits gained; the 6NF can then be optimised, etc.; but none of that is required here.
Some of the tables happen to be in 6NF, but as a consequence, not by intent, so they cannot be declared as such.
If you provide information about the limits and overrides that concern you, I can provide a model that resolves those issues.
Since the data is modelled, it is already set up for fast comparisons (generating alarms, etc.).
Readers unfamiliar with the standard for modelling relational databases may find the ▶IDEF1X Notation◀ useful.
Feel free to ask clarifying questions, either as comments or as edits to your question.
Caveat
The OO and ORM crowd (led by Fowler and Ambler) are clueless about relational technology and databases. Designing objects is quite different from modelling data. If you apply object design to databases, you end up with monstrosities that need "re-factoring", and you have to buy yet another "book" that shows you how to do that efficiently. In the meantime the "database" is crippled.
Relational databases modelled correctly (as data, not objects) never need "re-factoring". In highly normalised databases, you can add tables, columns, and functions without having to change the existing data or code.
Even the concept of ORM is totally flawed. Data has more permanence than objects. If you model the data first, then model your objects from the data, the result is stable. If you model the objects first (which is weird anyway, without an understanding of the data), then model the data after the objects, you will go back and forth trying to get both correct.
Relational databases have had ordinary structures such as supertype-subtype for over 30 years, and they work well, if implemented as such. This is not "gen-spec" or "class-inheritance" or any such OO thing; and if those OO or ORM structures are implemented without modelling the data correctly, the "database" is crippled, and will need "re-factoring".
- Additionally, they do not implement the required data integrity constraints, so the data quality is poor. A database does not allow bad data to enter; their "databases" are full of bad data, and so they need another "book" on how to wash dirty data.
- They have the sequence, and the hierarchy, mixed up. Done correctly, there is no "impedance mismatch", and no pseudo-technical names to mask pure stupidity and to justify doing the same set of work over and over again.
So run like hell from anyone using OO or ORM terminology when dealing with relational databases.
v1.1
Your edit provides far more detail, which of course is demanded, because the context, the whole of it, is necessary if the data is to be modelled correctly. This model incorporates that info. However, questions remain, and some back-and-forth will be required before it is complete. Feel free to ask about anything that is not absolutely clear; I am not sure where a gap is until you throw something up and we speak about it.
All my models are pure relational (they retain full relational power), IDEF1X compliant, and in Fifth Normal Form (no update anomalies). All the rules (business rules or data/referential integrity) drawn in the model can be implemented by declaration in ISO/IEC/ANSI SQL.
Never hard-code anything. My models do not require it, and code working against the database does not have to do it. All fixed text is normalised into reference or look-up tables. (That bit is incomplete; you need to fill in the gaps.)
A short alphabetic code is far better than an enumeration; once you have used it, the values and their meanings become recognisable.
Because they are PKs, and therefore stable, you can safely code:
... WHERE EventTypeCode = "P"
or
... WHERE EventTypeCode LIKE "T%"
I believe the datatypes are self-evident or can be worked out easily; if not, please ask.
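As a small illustration of such a look-up table with short, stable alphabetic codes (SQLite from Python; the codes and names are invented for the example, not the model's actual EventType list):

```python
import sqlite3

# Classification (look-up) table keyed by a short alphabetic code.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE event_type (
        event_type_code TEXT PRIMARY KEY,   -- short, stable code
        name            TEXT NOT NULL
    )""")
con.executemany("INSERT INTO event_type VALUES (?, ?)",
                [("P",  "Pump Prime"),
                 ("TB", "Temp Basal"),
                 ("TE", "Temp Basal End")])

# Because the codes are PKs, and therefore stable, queries can safely
# reference them directly, including by prefix:
temp = con.execute(
    "SELECT event_type_code FROM event_type "
    "WHERE event_type_code LIKE 'T%' ORDER BY event_type_code").fetchall()
print(temp)   # [('TB',), ('TE',)]
```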
Everything you noted as "finicky" is valid. The issue is that, since you have not had a database to engage with, you did not know what should go in the database vs what should, or can, be done in SQL code. Therefore all the "finicky" items have been provided for (the database elements); you need to construct the code. Again, if there is a gap, please ask.
- What I am saying is, working in the traditional style of me as data modeller and you as developer, you have to ensure that every item from your perspective is delivered, rather than relying on me to interpret your notes. I am delivering a database that supports all the requirements I can glean from your notes.
One patient per database? Let's allow for the possibility that the system is successful, and that in the future you have one central workhorse database, rather than limiting it to one database per patient, which would be a nightmare to administer. Let's say you need to keep the patient details in one place, with one version of the truth; that is what I have provided. It does not limit you, in the short term, from implementing one db per patient; there is no problem at all with having just one row in the Patient table.
Alternately, you can strip PatientId out of all the tables, but then, when you grow into the central database configuration, it will require a major database upgrade.
Likewise, if you have Sensors or Pumps that need to be tracked, please identify their attributes; they will be normalised into Sensor or Pump tables. If it is "one per patient", that's fine; there will be just one row in those tables, unless you need to store the history of sensors or pumps.
In v1.0 the subtypes were exclusive. They are now non-exclusive. That means you are tracking a chronology of Events, without duplication, and a single Event may consist of more than one subtype. E.g., a Note can be inserted for any Event.
- Before completion, the EventType list you provided needs to be filled out, in the form of a grid, showing the (a) permitted and (b) mandatory subtypes per EventType. That will be implemented as CHECK constraints in Event.
Naming is important. I am using ISO standard 11179 (guidelines and principles) plus my own conventions. Reading-type Events are prefixed as such. Feel free to suggest changes.
Units. Traditionally, we use either metric xor imperial across the whole database, allow entry in whatever units the user likes, and convert before storage. If you need a mixture, at the very least you should have the UnitType specified at the Patient or Pump level, rather than allowing storage in either UnitType. If you truly need either UnitType stored, changing back and forth, then yes, you need to store the UnitType with each such value.
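A sketch of the convert-before-storage approach, normalising glucose readings to one unit system. The function and constant names are mine; the factor of 18 mg/dL per mmol/L is the usual approximation for glucose (18.016 more precisely).

```python
# Normalise every glucose reading to mmol/L before it is stored, so the
# database holds a single UnitType. Names and rounding policy are invented.
MGDL_PER_MMOLL = 18.0   # common approximation for glucose

def to_mmol_l(value, unit):
    """Return the reading in mmol/L, whatever unit it was entered in."""
    if unit == "mmol/L":
        return value
    if unit == "mg/dL":
        return round(value / MGDL_PER_MMOLL, 1)
    raise ValueError("unknown unit: " + unit)

print(to_mmol_l(99.0, "mg/dL"))    # 5.5
print(to_mmol_l(6.2, "mmol/L"))    # 6.2
```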
Temporal database. You have time series being recorded, and interpreted via SQL. This is a big subject, so read up on it. The minimum I ask is that you read and understand:
▶Temporal Database Performance (0NF vs 5NF)◀
▶Classic 5NF Temporal Database◀ (inspect the data model carefully)
Basically the issue boils down to this:
Either you have a true 5NF database, with no data duplication and no update anomalies. That means, for a continuous time series, only the StartDateTime is recorded. The EndDateTime is derived from the StartDateTime of the next row, and is not stored. E.g., Event is a continuous chronology; the EventType identifies whether the Event is a specific DateTime or a period/duration.
An EndDateTime is stored only for disjoint periods, where there are legitimate gaps between the periods; in any case that is identified via the EventType. E.g., Exercise, PumpSuspended. (Incidentally, I am suggesting that the patient knows the actual, as opposed to the planned, attributes only at the end of the exercise period.)
Where there is no EndDateTime, the StartDateTime is simply the DateTime. E.g., EventDtm.
This requires the use of ordinary SQL subqueries. That is quite simple once the coder has a grasp of the subject; for those who don't, I have supplied a full tutorial on subqueries in general, and on using them in the temporal context in particular, in:
▶It Is Easy When You Know How◀. Not coincidentally, it deals with the same Classic 5NF Temporal Database above.
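The derive-the-EndDateTime idea can be sketched as follows (SQLite from Python; the table and column names are invented for the example): only StartDateTime is stored, and each row's end is the next row's start, obtained with an ordinary subquery.

```python
import sqlite3

# 5NF time series: only the start of each basal period is stored.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE reading_basal (
        start_dtm TEXT PRIMARY KEY,
        rate      REAL NOT NULL      -- units/hr
    )""")
con.executemany("INSERT INTO reading_basal VALUES (?, ?)",
                [("2011-02-01 00:00:00", 0.7),
                 ("2011-02-01 07:00:00", 0.9),
                 ("2011-02-01 12:00:00", 0.8)])

# Derive each row's end as the StartDateTime of the next row; the last
# (current) period has no end yet, so its end_dtm is NULL.
rows = con.execute("""
    SELECT b.start_dtm, b.rate,
           (SELECT MIN(start_dtm)
            FROM   reading_basal
            WHERE  start_dtm > b.start_dtm) AS end_dtm
    FROM   reading_basal b
    ORDER  BY b.start_dtm
""").fetchall()
print(rows[0])   # ('2011-02-01 00:00:00', 0.7, '2011-02-01 07:00:00')
```

Because no end is stored, changing one period's start can never contradict a stored end: there is nothing to keep in sync.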
Xor you have a 0NF database with an EndDateTime column stored (100% duplication) for every StartDateTime column, and you can use flat, slow queries: lots of manipulating large result sets with GROUP BYs, instead of small result sets. Massive data duplication and update anomalies are thereby introduced, reducing the database to a flat file, to supply the needs of coders of limited ability (certainly not "ease of coding").
Therefore consider it carefully and choose, for the long term only, because this affects every code segment accessing temporal data. You do not want to re-write all that halfway down the track, when you realise that maintaining the update anomalies is worse than writing subqueries.
Of course, I provide the explicit requirements to support the 5NF temporal database: the correct datatypes, etc., to support the identified requirements.
Further, if you choose 0NF, I will provide the extra fields, so that the data model is complete for that purpose.
In either case, you need to work out the SQL code required for any given query.
Datatype handling is important. Do not store a time (hours, etc.) as an integer or an offset; store it in a TIME or DATETIME datatype. If it is an offset, store the time since midnight. That allows unrestricted SQL, including the date arithmetic functions.
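For instance, with proper datetimes the engine's own date arithmetic is available; a small SQLite example (SQLite's `julianday` function returns fractional days, so the difference times 24 is hours):

```python
import sqlite3

# Date arithmetic directly in SQL, because the values are real datetimes,
# not opaque integer offsets.
con = sqlite3.connect(":memory:")
hours = con.execute("""
    SELECT (julianday('2011-02-01 22:30:00') -
            julianday('2011-02-01 20:00:00')) * 24.0
""").fetchone()[0]
print(hours)
```

Here `hours` comes out as 2.5 (to within floating-point precision); with integer offsets every such calculation would need hand-written conversion code.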
A task for you. Go through the model carefully, and ensure that:
- every non-key attribute has a 1::1 relationship with its primary key
- and that it does not have a relationship with any other PK (in any other table)
And of course, check the model and provide feedback.
Question
Given the above explanations and guidance:
What is ReadingBasalTemperature.Type, and what is the list of values, please?
What is HbA1c?
What is a Ketone?
Do you need these (i.e. is each a duration/period with an EndDateTime):
- ReadingBasalTemperatureEnd
- ReadingBolusEnd
- Basal pattern
- BasalTemp pattern
- actually, what is a pattern, and how is it derived/compared?
How is the BasalTemperatureEnd (or duration) determined?
My starting position is that there is no need to store the Active Insulin duration. You need to define how the EndDateTime is determined. Based on that, if it cannot be derived, and/or is based on many factors or changes over time, then storing the EndDateTime might be good.
The pump settings need clarification.
v1.2
OK, I have incorporated the information you have provided in the question and comments. Here is the progressed data model.
There are still issues to be resolved.
Use a percentage or a rate only, not both plus an additional indicator; one can be derived from the other. I am using rate, consistently.
"... my worry with that approach is that on many days the basal rate is identical ... hence the redundancy"
That is not "redundancy"; it is the storage of a time series of facts, which happen to be unchanging. The queries required remain straight-forward.
However, in advanced use, yes, you can avoid storing the unchanged fact, and instead extend the stored duration to include the new time interval.
I am still not clear on your explanation of Basal Temp. Please study the new model. First, the patterns are stored separately. Second, I am recording the Basal Temp start, with its rate. Do you need the Basal Temp end (with a rate)?
"GlucoseEventType should be able to have more than one value per glucose result" needs more definition. Don't worry about Id keys; tell me about the data. For each ReadingGlucoseBlood, name the result values, and the GlucoseEventTypes that apply to it; mandatory and optional.
PumpHistory.InsulinEndDateTime is the ending instant of a duration. Of course it is generic; the starting instant is that of whatever row you compare it to. It should be the seconds or minutes since midnight 01 Jan 1900.
Check the new Event PK. Where an incoming record identifies several Events, you need to parse that, and insert each Event-EventSubtype row using the same DateTime.
Except for Patient, there are no Id keys in the database; none are required thus far. Refer to the parent by its full PK.
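The parsing step can be sketched as follows (Python with SQLite; the CSV layout, table, and type codes are invented for the example): one incoming record that groups several event types is split into one Event row per type, all sharing the same DateTime.

```python
import csv
import io
import sqlite3

# Supertype table only, for brevity; PK is (DateTime, EventTypeCode).
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE event (
        event_dtm       TEXT NOT NULL,
        event_type_code TEXT NOT NULL,
        PRIMARY KEY (event_dtm, event_type_code)
    )""")

# One hypothetical CSV row carrying both a glucose reading and a carb
# intake at the same second: datetime, glucose, carbs.
raw = "2011-02-01 08:00:00,5.1,45\n"
for dtm, glucose, carbs in csv.reader(io.StringIO(raw)):
    if glucose:   # non-empty field -> a glucose ('G') event
        con.execute("INSERT INTO event VALUES (?, 'G')", (dtm,))
    if carbs:     # non-empty field -> a food ('F') event
        con.execute("INSERT INTO event VALUES (?, 'F')", (dtm,))

rows = con.execute(
    "SELECT * FROM event ORDER BY event_type_code").fetchall()
print(rows)  # [('2011-02-01 08:00:00', 'F'), ('2011-02-01 08:00:00', 'G')]
```

Each subtype row would then be inserted with the same (DateTime, EventTypeCode) pair, exactly as the PK demands.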
05 Feb 11
No feedback received re v1.2.
"A lot of the data I'm getting is being pulled from an external (and disorganized) csv that groups event types under one row, and has events on the same second, which is as granular as it gets."
That is easy to overcome. However, it means that an instant is not an instant. We could walk through the whole exercise, but the bottom line is simple:
If you need it, add a SequenceNo to the PK, to make it unique. But I suspect the EventTypeCode is enough (there cannot be more than one Event of the same EventType per second). If not, let me know, and I will change the model.
Retain the meaning of an instant as an instant, and avoid departing from the architectural requirements of temporal databases.
Use the EventType to afford uniqueness to the DateTime PK.
- Keep in mind that the EventTypeCode deployed in the Event PK is not a discriminator requirement; it is there to afford uniqueness. Its presence in the PKs of the subtypes is an artefact of that, not of a discriminator (which is known by virtue of the subtype).
However, there was unnecessary complexity due to the non-exclusive subtype (there can be more than one subtype per supertype row).
I have therefore changed it to an exclusive subtype, which is deterministic: one EventType per supertype row; at most one subtype.
Refer to Implementing Referential Integrity for Subtypes for specific information re the constraints, etc.
The change to the data model is too small to warrant a release; I have simply updated the v1.2 data model.
06 Mar 11
Due to my upholding the "above all, be technically honest" guideline in the FAQ, and confronting the misinformation it asks to be confronted, I was suspended for my efforts (which means I can no longer correct misinformation on SO, and such posters now have protected reign). The interaction with the seeker was carried on, to completion, and the final data model was completed, away from SO. The progression is therefore lost to readers. However, it may be of value to post the result: the ▶Final Data Model v1.16◀.
- Events have a starting instant (Event.DateTime).
- Some Events have durations, in which case an ending instant (an Event) is required.
- Some Events consist of the supertype only; others require a subtype. That is identified in the third column of the EventType exposition.
- The fourth column identifies the type of Event:
  - Instant or Duration
  - for Durations: Conjunct or Disjunct
- Note that the resolution of DATETIME on the seeker's platform is one second, and many Events may occur in one second, but not more than one of the same EventType. The EventTypeCode has therefore been included in the Event primary key to implement that rule. That is an artefact, not a generic requirement for the supertype-subtype structure, or for exclusive/non-exclusive subtypes.
- The model is intended for printing on two facing letter pages, enlarged or not, with a gusset.