forecasting:large_sets_of_data

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
forecasting:large_sets_of_data [2015/12/08 04:01]
jeremygg
forecasting:large_sets_of_data [2021/09/19 21:59] (current)
Line 13: Line 13:
  
 ===Solution 1)=== ===Solution 1)===
- Note there is many solution. One solution is to take your first the very first offer, maybe the first offer that you are content with over some value $Y, and you leave to the next bar.+ Note there is many solutions/​plan of attacks. One solution is to take your first the very first offer, maybe the first offer that you are content with over some value $Y, and you leave to the next bar
 + 
 +The hardest part is you don't know the distribution of the amount of money each person is willing to give you. If you wait for $Y, maybe not a single is someone willing to give you that much. Maybe there is a person willing to give you that much, but there is a person willing to give you double that amount! We are going to make a couple assumptions. The place is so popular that there will always be a constant flow of people coming out of the bar. Another assumption is he is a committed beggar with lots of time and willing to stick there for a finite or infinite amount of time. Thirdly, the amount of money given when each person that is asked are independent of one another. Since the smallest amount of money is a penny, the distribution X (The number of money a person is willing to give) is technically discrete.  
 + 
 +Lets say the maximum a person is willing to give is y<​sub>​m</​sub>​. The probability of someone asking to give you the max amount of money is P<​sub>​X</​sub>​(X=y<​sub>​m</​sub>​). or the CDF 1-F<​sub>​X</​sub>​(X=y<​sub>​m</​sub>​-.01). The probability of getting the maximum in the first n people would be equal to P<​sub>​X</​sub>​(X=y<​sub>​m</​sub>​)<​sup>​n</​sup>​. The probability of the maximum offer within the first n people being y<​sub>​l</​sub>​ F<​sub>​X</​sub>​(X=y<​sub>​l</​sub>​)<​sup>​n</​sup>​. Try this with most distributions that met the assumptions and if n is large, the probability gets larger if y<​sub>​l</​sub>​ gets large. A good plan of attack is to ask the first n people for money, but don't take any. Then after n people, take the first offer that matches or meets that maximum. A flaw is if you were very very unlucky and you got stingy people for your first n, this would be unlikely if you get n large enough (As long as he doesn'​t die form old age.) Secondly if everyone who was going to give you an amount of y<​sub>​l</​sub>​ or greater has already passed and no one else would give you. The guy is stuck forever, but since we assume independence and a constant flow of people, the probability of this happening is 0. Lastly is you may want to find the n that is optimal where its the time he should spend waiting versus cash out which goes against the fact that he is a committed beggar.
  
 ===Problem 2)=== ===Problem 2)===
-Suppose you are in-charge ​of +You are part of a game show where people can join in and out of the game at any time. The show host picks a ball with a random number j on it which the numbers come from and may repeat from a certain distribution J. Prior to the picking, people give an X amount of money to the host. If you gave the host less than the amount of the number on the ball, you are out of the game. You want to stay on the show for as long as you can for whatever reasons.You decided to just watch for n rounds and keep track of the amount of numbers (Lets call these values X<​sub>​1</​sub>​ to X<​sub>​n</​sub>​). How much would you give to the show host every round?
  
 ===Solution 2)=== ===Solution 2)===
 +Of course you can spend exactly infinity dollars every round and never get kicked off, but lets excuse this possibility. You could pick a large number C every single round. The probability that you'll get kicked off the show where n is the number of rounds would be *Insert pick of equation sum{0 to inf} (P<​sub>​J</​sub>​(j<​=C))<​sup>​n</​sup>​P<​sub>​J</​sub>​(j<​C) = sum{0 to inf} (F<​sub>​J</​sub>​(j=X)<​sup>​n</​sup>​)(1-F<​sub>​J</​sub>​(j=X)).* If J doesn'​t have a finite maximum value then this is relative to the sum of a geometric probability function (You should review geometric random variables Ex: Flipping coins). This means the probability that you will lose eventually is 1. This solution isn't necessarily bad but lets look at more.
  
 +Second solution is to keep increasing the maximum value. Let's call the function G(n). Considering that J may be infinity, it is impossible to have a 0 percent probability that you won't get kicked off. But we can say we would at-most want a probability is ε. So that means you want to design G(n) such that *Insert equation ε <= sum(n 0 to inf)P<​sub>​J</​sub>​(j<​=G(n))<​sup>​n</​sup>​P<​sub>​J</​sub>​(j>​G(n)) = sum{0 to inf} (1-F<​sub>​J</​sub>​(j=G(n)))<​sup>​n</​sup>​F<​sub>​J</​sub>​(j=G(n)).
  
-The hardest part is you don't know the distribution of the amount of money each person is willing to give you. If you wait for $Ymaybe not a single is someone willing to give you that much. Maybe there is a person willing ​to give you that much, but there is a person willing ​to give you double that amount! We are going to make a couple assumptionsThe place is so popular ​that there will always be constant flow of people coming out of the bar. Another assumption is he is a committed beggar with lots of time and willing to stick there for a finite or infinite amount of time. Thirdly, the amount of money given when each person ​that is asked are independent of one another. Since the smallest amount of money is a penny, ​the distribution X (The number ​of money a person is willing to give) is technically discrete.  +Finally ​you realizewell you are going to lose eventually, but you want to have control the rate you will leaveYou realize ​that the G(n) increases and you will lose way too much money at rate too fast. So from using what we learned from the first problem, we can just observe ​for a long time, as n gets very large, we can assure that we can pick the right C from the first scenario. The chances ​that you picked ​the wrong C simply by looking at the previous maxes would be the probability that none of the numbers observed was greater than the maximum(X<sub>i</​sub> ​multiplied by the probability you'll fail which is less than the following equation since C is even larger than that max *Insert equation F<sub>J</​sub>​(j=C)<sup>n</sup>sum{k 0 to inf} (P<​sub>​J</​sub>​(j<=C))<​sup>​k</​sup>​P<sub>J</​sub>​(j>C) = F<​sub>​J</​sub>​(j=C)<​sup>​n</​sup>​. ​Note that this is either a very small probability ​if is not sufficiently ​large enoughbut a high probability overall but the probability ​of losing a single round becomes very smallIt is also good to note that the larger C is compared ​to the maximum of the X<sub>i</​sub>​, it will decrease the probability,​ but increase your cost tooAn '​OK'​ assumption to make is the probability of a number large number n between ​0 to infinity doesn'​t increase as gets larger, ​which is not necessarily true but it is very common in most probability distribution function
- +
-Lets say the maximum ​a person is willing to give is y<sub>m</​sub>​. The probability ​of someone asking to give you the max amount of money is P<sub>X</​sub>​(X=y<​sub>​m</​sub>​). or the CDF 1-F<sub>X</sub>(X=y<​sub>​m</​sub>​-.01). The probability of getting the maximum in the first n people would be equal to P<​sub>​X</​sub>​(X=y<​sub>​m</​sub>​)<​sup>​n</​sup>​. The probability of the maximum offer within the first n people being y<sub>l</​sub>​ F<​sub>​X</​sub>​(X=y<​sub>​l</​sub>​)<​sup>​n</​sup>​. ​Try this with most distributions that met the assumptions and if is large, the probability ​gets larger if y<​sub>​l</​sub>​ gets largegood plan of attack ​is to ask the first n people for money, but don't take any. Then after n people, take the first offer that matches or meets that maximum. A flaw is if you were very very unlucky and you got stingy people for your first n, this would be unlikely if you get n large enough (As long as he doesn'​t die form old age.) Secondly if everyone who was going to give you an amount ​of y<sub>l</​sub> ​or greater has already passed and no one else would give youThe guy is stuck forever, but since we assume independence and a constant flow of people, ​the probability of this happening is 0. Lastly is you may want to find the that is optimal where its the time he should spend waiting versus cash out which goes against the fact that he is a committed beggar.+
  
 +A final method if you don't want to sample long enough is to take the mean of the current data you have and make sure your number is greater than the mean by a r number of deviations where you can pick r for probability vs cost and assume J is Gaussian like even though it probably isn't. THE MAIN POINT OF THIS THOUGH IS THE METHOD BEFORE THIS ONE, OBSERVING PAST VALUES IS A GOOD WAY TO REASSURE YOURSELF!
  
 ===Major Complications=== ===Major Complications===
  • forecasting/large_sets_of_data.1449547297.txt.gz
  • Last modified: 2021/09/19 21:59
  • (external edit)