Making the QTY value to multiple rows

gbrown · 2024-04-17T16:49:23+00:00

There was an error rendering this rich post.

Hello,

I am trying to take the data below and make the 'ORDQTY' column multiple rows based on Columns C and T and U. How would I do this? The end goal would be for row 2 would have two identical rows since there are a quantity of 2. Row 7 would have 4 identical rows and so-on. I am trying to have 1 unique row based on quantity.

pLANT ssd ORDQTY

BBQ-Built-In 04/05/2024 1

Quick Links

Accepted answers

Chris_Wolman

Hi @gbrown ,

You will need a dataset in Domo that contains 2 columns. One column to be joined on and another column with a value. You will notice that there are 2 rows for 2 and 3 rows for 3 and so on. As you could imagine, this table can be quite large if you need a large number of rows to expand. Here is an example of mine.

Here is my sample starting data.

You will need to join your ORDQTY column to the join_col column.

This will result in a row for each qty.

Enjoy!

rco

When the join 'on' expression feature becomes available to you there will be a better way to do this, but for now, here is the best I can come up with:

The general strategy is to make a table that contains all the unique values of ORDQTY repeated a number of times equal to that value. Then we will join that table to our original data on ORDQTY, which will effectively multiply those original rows the correct number of times.

First, we'll want to turn ORDQTY into an integer or, if the decimal/float value needs to be preserved, then make a new column containing ORDQTY as an integer ( CAST(ORDQTY AS LONG) ). We will branch from here, with one branch going straight into our final join and the other branch making our repeats table before heading to the final join. To make the repeats table:

1. Select just ORDQTY

2. Use Remove Duplicates to get down to just one row per unique ORDQTY

3. Now we need to multiply this table many times to ensure that we have enough rows to start with before filtering them down to the proper quantities. The best way to do this is a combination of self-unions (using Append Rows) and self-joins on ORDQTY (using Join Data). Self-unions will double the number of repeats, and self-joins on ORDQTY will square the number of repeats. You start with one repeat of each value (since we used Remove Duplicates). Decide on a reasonable theoretical maximum for ORDQTY and come up with a combination of doubles and squares that get you from one to that number. For example, if the maximum possible quantity is 1000, you could double twice (bringing you to four), then square twice (bringing you to 256), then double twice more (bringing you to 1024 repeats). Be careful of squaring your repeats too far; with just five self-joins you would be producing quintillions of rows!

To perform a self-union or a self-join, you need to add do-nothing Selects in between, because you can't connect one tile to another twice.

4. Now add a constant zero to fulfill the need for a "sort" column in the Rank & Window coming next.

5. Add a Rank & Window, partition on ORDQTY, sort by the dummy zero constant, and add the function Row Number in a new column "ROWNUM".

6. Add a Filter with the predicate ORDQTY >= ROWNUM. This will cause every ORDQTY value to be repeated only as many times as its value, where before all of them had been repeated the same large number of times.

7. Select only the ORDQTY column.

Finally, you can inner join this repeats table to the original input on the ORDQTY column. You can drop this column from either side of the join. This join will repeat-multiply the rows by their ORDQTY value, because of the way we constructed the repeats table.

In the end, the flow looks something like this:

JerryZ

You need an OrdQty Line file like the following. For every Increment in OrdQty, a new line is required.
Join your Order file and the Line file together
Input Screenshot
Output Screenshot

a Python script to create an OrdQty Line file, the sample is up to 50 (Qty)

import pandas as pd
Using a nested loop to create the data as described
data_nested = []

for i in range(1, 51):

for j in range(1, i+1):

data_nested.append({'ORDQTY': i, 'Line Number': j})
Convert to DataFrame
df_nested = pd.DataFrame(data_nested)
Save to CSV file
csv_path_nested = '/mnt/data/ORDQTY_Nested_Data.csv'

df_nested.to_csv(csv_path_nested, index=False)
csv_path_nested

ORDQTY_Nested_Data_.xlsx

All comments