Hi,
I am trying to get data for everyday in a particular date range even if there is no data for that day.
For example,
Item | Location | Date | Count |
A | USA | Jan 3, 2018 | 5 |
A | USA | Feb 10, 2018 | 12 |
A | USA | Feb 25, 2018 | 13 |
A | Canada | Jan 7, 2018 | 14 |
A | Canada | Feb 10, 2018 | 10 |
A | Canada | Mar 3, 2018 | 5 |
| | | |
Desired O/P:
Item | Location | Date | Count |
A | USA | Jan 1, 2018 | 0 |
A | USA | Jan 2, 2018 | 0 |
A | USA | Jan 3, 2018 | 5 |
A | USA | Jan 4, 2018 | 5 |
A | USA | Jan 5, 2018 | 5 |
A | USA | Jan 6, 2018 ….Feb 9,2018 | 5 |
A | USA | Feb 10, 2018 | 12 |
A | USA | Feb 11 …Feb 24 | 12 |
A | USA | Feb 25 | 13 |
A | USA | Feb 26… Current Day | 13 |
A | Canada | Jan 1.. Jan 6 | 0 |
A | Canada | Jan 7, 2018 | 14 |
A | Canada | Jan 8…Feb 9 | 14 |
A | Canada | Feb 10, 2018 | 10 |
A | Canada | Feb 11 – Mar 2 | 10 |
A | Canada | Mar 3 | 5 |
A | Canada | Mar 4 – Current Day | 5 |
I came up with this SQL Procedure:
CREATE PROCEDURE add_missing_days()
BEGIN
DECLARE DONE INT DEFAULT 0;
DECLARE ITEMID INTEGER;
DECLARE LOCID integer;
DECLARE FND INTEGER;
DECLARE DT DATE;
DECLARE END_DT DATE;
DECLARE PREV_RATE VARCHAR(255);
DECLARE item_ids CURSOR FOR SELECT DISTINCT `ITEM_ID`,`LOCATION_ID` FROM `transform_data_1` ORDER BY `ITEM_ID`,`LOCATION_ID`, `Transaction Date` ASC;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET DONE = 1;
OPEN item_ids;
item_ids_loop : LOOP
FETCH item_ids INTO ITEMID, LOCID;
IF DONE = 1
THEN CLOSE item_ids;
LEAVE item_ids_loop;
END IF;
SET DT = str_to_date('2018-01-01', '%Y-%m-%d');
SET END_DT = CURRENT_DATE;
WHILE DT < END_DT DO
SET PREV_RATE = 0;
SET FND = ( SELECT COUNT(*) FROM `transform_data_1`
WHERE `ITEM_ID` = ITEMID AND `LOCATION_ID` = LOCID and `Transaction Date` = DT );
IF ( FND = 0 ) THEN
INSERT INTO `transform_data_1`( `ITEM_ID`, `LOCATION_ID`, `Transaction Date`, `Inventory Count` )
VALUES( ITEMID, LOCID, DT, PREV_RATE );
ELSE
SET PREV_RATE = ( SELECT `Inventory Count` FROM `transform_data_1`
WHERE `ITEM_ID` = ITEMID AND `LOCATION_ID` = LOCID AND `Transaction Date` = DT );
END IF;
SET DT = DATE_ADD( DT, INTERVAL 1 DAY );
END WHILE;
END LOOP item_ids_loop;
END
This code works for just 1 month Date range . But if I try running it for the entire date range , it times-out and fails. Considering I am working on a big dataset and adding a row for each day for each Item and Location it takes a lot of processing time. For now I have broken the dataflow into different months and eventually combining all the dataflows into one.
However this is not dynamic and I will have to keep adding new dataflows for each month over time. I just want to achieve this in a single dataflow.
So I was wondering if there was a more efficient way to achieve this.
Thanks,
Prajwal