Calculating Average Time Between First/Last Login

Hi,

 

I have a table of user logins that has user id in one column and the date they logged in in another. I'm trying to calculate how long users typically login before they become inactive. Ideally, I'd be able to map number of months on one access and number of users on the other.

 

What type of operations would I need to achieve this? I know I'd need to return the first and last login for each distinct user, do a time differential (as a whole month) and then count how many users fall under each.

 

Input Table:

Screen Shot 2017-10-17 at 10.44.36 AM.png

 

Output:

Screen Shot 2017-10-17 at 10.49.31 AM.png

Best Answers

  • RGranada
    RGranada Contributor
    Answer ✓

    Hi,

     

    If I correctly understood your problem, here's the approach I would take. 

     

    It involves creating a MySql Dataflow like this (you can use this directly on the output if you do not have other transformations implied):

     

    SELECT i4w_userid,Max(i4w_date) as First_Login_Date,Min(i4w_date) as Last_Login_Date, TIMESTAMPDIFF(MONTH,Min(i4w_date),Max(i4w_date)) AS Month_Diferential, Count(i4w_userid) AS Number_of_Logins FROM your_dataset_name
    GROUP BY i4w_userid

     

    This will give the necessary fields and more some to achieve your results.

     

    Then set up a card like the following:

    DOJO_Logins.png

     

    Don't forget to set the calculation of i4w_userid to "Count".

     

    Hope this helps. Let me know if you need some more tweaks on this.

    Ricardo Granada 

    MajorDomo@Lusiaves

    **If the post solves your problem, mark it by clicking on "Accept as Solution"
    **You can say "Thank you" by clicking the thumbs up in the post that helped you.
  • RGranada
    RGranada Contributor
    Answer ✓

    Hi @ST_-Superman-_, you are right on the count aggregation... As to the MAX(i4w_date)  I also found it odd as it was returning the oldest date not the newest one... that was why I swapped it. But I could have missed something.

     

    Best Regards,

    Ricardo Granada 

    MajorDomo@Lusiaves

    **If the post solves your problem, mark it by clicking on "Accept as Solution"
    **You can say "Thank you" by clicking the thumbs up in the post that helped you.

Answers

  • RGranada
    RGranada Contributor

    Hi,

     

    Can you send a sample of your data in excel?

     

    Regards,

    Ricardo Granada 

    MajorDomo@Lusiaves

    **If the post solves your problem, mark it by clicking on "Accept as Solution"
    **You can say "Thank you" by clicking the thumbs up in the post that helped you.
  • Yes, of course. Each time a user logs in, it's a new row with that ID and the date/timestamp. So a user may have 40 total logins, each as a separate row entry.

  • RGranada
    RGranada Contributor
    Answer ✓

    Hi,

     

    If I correctly understood your problem, here's the approach I would take. 

     

    It involves creating a MySql Dataflow like this (you can use this directly on the output if you do not have other transformations implied):

     

    SELECT i4w_userid,Max(i4w_date) as First_Login_Date,Min(i4w_date) as Last_Login_Date, TIMESTAMPDIFF(MONTH,Min(i4w_date),Max(i4w_date)) AS Month_Diferential, Count(i4w_userid) AS Number_of_Logins FROM your_dataset_name
    GROUP BY i4w_userid

     

    This will give the necessary fields and more some to achieve your results.

     

    Then set up a card like the following:

    DOJO_Logins.png

     

    Don't forget to set the calculation of i4w_userid to "Count".

     

    Hope this helps. Let me know if you need some more tweaks on this.

    Ricardo Granada 

    MajorDomo@Lusiaves

    **If the post solves your problem, mark it by clicking on "Accept as Solution"
    **You can say "Thank you" by clicking the thumbs up in the post that helped you.
  • Nicely done @RGranada

     

    One change that I would make, although it would not impact this card at all, is that MAX(i4w_date) should be "Last_Login_Date" and the MIN(i4w_date) should be "First_Login_Date".

     

    Also, if you group it by i4w_userid; then the last part of your select statement would just return 1's so I'm not sure that I would include it.  Unless you are wanting to do some math calculations and want a numeric value in your data set.


    “There is a superhero in all of us, we just need the courage to put on the cape.” -Superman
  • RGranada
    RGranada Contributor
    Answer ✓

    Hi @ST_-Superman-_, you are right on the count aggregation... As to the MAX(i4w_date)  I also found it odd as it was returning the oldest date not the newest one... that was why I swapped it. But I could have missed something.

     

    Best Regards,

    Ricardo Granada 

    MajorDomo@Lusiaves

    **If the post solves your problem, mark it by clicking on "Accept as Solution"
    **You can say "Thank you" by clicking the thumbs up in the post that helped you.
  • Thank you both, this is incredibly helpful and I've learned something new ?