Hello, I have a dataframe that looks like the one below. The data is capacity information for a cluster system, and this is only a small sample of it. Notice that the days_to_full column fluctuates a lot within a span of about 20 days. The Threshold_date column is the date when the disk will be full, obtained by adding days_to_full to the Datetime column.
I would like to build a model that ignores the large fluctuations in Threshold_date and notifies me only when the growth is fairly linear, that is, when the growth is real. How would I do that? I would like to know at least 90 to 100 days in advance that I am really going to run out of space. Fluctuations like these could give me wrong predictions and cause panic for no reason.
I have been reading a lot of articles, but I am unable to identify which (mathematical) model in Python would help me make better estimates.
I would appreciate it if someone could explain why they suggest a particular model, and which parameters could be tuned to make its predictions as accurate as possible. Thanks in advance!
cluster_name region Datetime total_data_capacity used_data_capacity available_data_capacity days_to_full Threshold_date
cluster01 lon 2021-03-02 29990.745869 23540.127364 6450.618505 219.000000 2021-11-14
cluster01 lon 2021-03-03 29990.745869 23555.783363 6434.962505 219.000000 2021-11-14
cluster01 lon 2021-03-04 29990.745869 23572.610517 6418.135352 219.833333 2021-11-14
cluster01 lon 2021-03-05 29990.745869 23589.994672 6400.751197 220.000000 2021-11-15
cluster01 lon 2021-03-06 29990.745869 23608.169950 6382.575918 220.000000 2021-11-15
cluster01 lon 2021-03-07 29990.745869 23612.727373 6378.018496 220.000000 2021-11-15
cluster01 lon 2021-03-08 29990.745869 23621.996424 6368.749444 220.000000 2021-11-15
cluster01 lon 2021-03-09 29990.745869 23642.840187 6347.905682 926.285714 2023-10-22
cluster01 lon 2021-03-10 29990.745869 23663.032472 6327.713397 1044.000000 2024-02-17
cluster01 lon 2021-03-11 29990.745869 23682.244640 6308.501229 1004.833333 2024-01-08
cluster01 lon 2021-03-12 29990.745869 23703.716183 6287.029686 997.000000 2024-01-01
cluster01 lon 2021-03-13 29990.745869 23723.670334 6267.075534 997.000000 2024-01-01
cluster01 lon 2021-03-14 29990.745869 23726.441732 6264.304136 997.000000 2024-01-01
cluster01 lon 2021-03-15 29990.745869 23638.685020 6352.060849 997.000000 2024-01-01
cluster01 lon 2021-03-16 29990.745869 23607.307080 6383.438789 1022.000000 2024-01-26
cluster01 lon 2021-03-17 29990.745869 23649.954446 6340.791423 1027.000000 2024-01-31
cluster01 lon 2021-03-18 29990.745869 23694.870332 6295.875536 991.545455 2023-12-26
cluster01 lon 2021-03-19 29990.745869 23739.976639 6250.769230 988.000000 2023-12-23
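
For illustration, here is a minimal sketch of a simple baseline to compare against: fit a straight line to used_data_capacity over the sample rows and extrapolate to total_data_capacity. It assumes growth is linear over the window and that the rows are one day apart. The function name `days_to_full_linear` is hypothetical, and this is not necessarily how the days_to_full column above was computed.

```python
import numpy as np

# used_data_capacity for cluster01, 2021-03-02 to 2021-03-19 (copied from the sample above)
used = np.array([
    23540.127364, 23555.783363, 23572.610517, 23589.994672, 23608.169950,
    23612.727373, 23621.996424, 23642.840187, 23663.032472, 23682.244640,
    23703.716183, 23723.670334, 23726.441732, 23638.685020, 23607.307080,
    23649.954446, 23694.870332, 23739.976639,
])
total = 29990.745869  # total_data_capacity, constant in the sample


def days_to_full_linear(used, total):
    """Least-squares line through daily usage, extrapolated to `total`.

    Returns (days remaining after the last sample, fitted growth per day).
    Assumes one sample per day and linear growth (an assumption, not a fact
    about the cluster).
    """
    days = np.arange(len(used), dtype=float)
    slope, intercept = np.polyfit(days, used, 1)
    if slope <= 0:
        # No growth in this window, so the line never reaches capacity.
        return float("inf"), slope
    day_full = (total - intercept) / slope
    return day_full - days[-1], slope


remaining, growth_per_day = days_to_full_linear(used, total)
print(f"growth per day: {growth_per_day:.2f}, days to full: {remaining:.0f}")
```

Fitting over the whole window rather than using day-to-day differences makes the estimate less sensitive to a single dip such as the one on 2021-03-15, but the choice of window length is itself a parameter that would need tuning.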