Wednesday, March 9, 2011

Stock Ticks Database Schema

This is what the stock ticks DB schema looks like.

CREATE table eod
(
        day date not null,
        symbol char(6) not null,
        open real not null,
        high real not null,
        low real not null,
        close real not null,
        volume bigint not null
) without oids;

CREATE INDEX eod_idx1 ON eod (day,symbol);
CREATE INDEX eod_idx2 ON eod (symbol);
CREATE INDEX eod_idx3 ON eod (open);
CREATE INDEX eod_idx4 ON eod (high);
CREATE INDEX eod_idx5 ON eod (low);
CREATE INDEX eod_idx6 ON eod (close);
CREATE INDEX eod_idx7 ON eod (volume);

CREATE table fundamentals
(
        day timestamp not null,
        symbol char(6) not null,
        industry text,
        sector text,
        shares bigint,
        pe real,
        pb real,
        de real,
        eps real
) without oids;

CREATE INDEX fund_idx1 ON fundamentals (day,symbol);
CREATE INDEX fund_idx2 ON fundamentals (symbol);
CREATE INDEX fund_idx3 ON fundamentals (industry);
CREATE INDEX fund_idx4 ON fundamentals (sector);
CREATE INDEX fund_idx5 ON fundamentals (shares);
CREATE INDEX fund_idx6 ON fundamentals (pe);
CREATE INDEX fund_idx7 ON fundamentals (pb);
CREATE INDEX fund_idx8 ON fundamentals (de);
CREATE INDEX fund_idx9 ON fundamentals (eps);

CREATE type adjtype AS ENUM ('split', 'dividend');

CREATE table adjustments
(
        day date not null,
        symbol char(6) not null,
        type adjtype not null,
        value1 text,
        value2 text,
        value3 text
) without oids;

CREATE INDEX adj_idx1 ON adjustments (day,symbol);

CREATE table info
(
        symbol char(6) not null,
        name text,
        gics text,
        lastupdate_adjustments timestamp,
        lastupdate_eod timestamp,
        lastupdate_fundamentals timestamp,
        lastupdate_info timestamp
) without oids;

CREATE INDEX inf_idx1 ON info (symbol);
CREATE INDEX inf_idx2 ON info (name);
CREATE INDEX inf_idx3 ON info (gics);
CREATE INDEX inf_idx4 ON info (lastupdate_adjustments);
CREATE INDEX inf_idx5 ON info (lastupdate_eod);
CREATE INDEX inf_idx6 ON info (lastupdate_fundamentals);
CREATE INDEX inf_idx7 ON info (lastupdate_info);


As you can see it will store 4 types of data; end of day prices (eod), fundamental indicators, price adjustments and general information. In future I may introduce intraday prices however that remains a long term goal. I need to walk before I can swim.

You should also notice I will not be storing adjusted end of day prices. Instead, adjustments will be applied in real ltime. This is because adjustments are retrospective. In other words, an adjustment that occurs tomorrow will have an impact on all historical prices. Maintaining a table of adjusted prices would require updates to the affected symbols rows every time a new adjustment is made. As the number of rows increase this could overwhelm the system.

To implement this feature I will use PL/SQL integrated with memcached to store adjusted prices in memory after they are first requested and calculated. On subsequent requests if the last adjustments date in the info table does not change the adjusted prices will be recalled from memory and bypass storage.

This design (I hope) should ensure the system is able to deliver adjusted prices quickly on second and subsequent requests and should be able to scale to intraday requirements easily.

2 comments:

  1. What's the point of creating indexes on numeric value fields such as pe? It seems unlikely you would ever use a particular pe to find a row. Is the purpose to make searches on ranges of pe faster, eg. pe>0 and pe <15 ?

    ReplyDelete
  2. Yes that is correct. The index will allow fast searches within a range.

    ReplyDelete