vendredi 3 avril 2020

Using vectorization to generate properly spaced random arrays

I'm working on a statistical mechanics class in python through Coursera. I've been trying to work out a problem where we see how many of particular valid states are occupied. I feel like my method should be mathematically equivalent to the one given, but it's not giving me the same answers (statistically, the number of occupied states is far too low in my method) and it's going more slowly. Can somebody help me with what I'm missing?

The two functions are:

def disk_sampling_theirs(N, sigma):
    condition = False
    while condition == False:
        L = [(random.uniform(sigma, 1.0 - sigma), random.uniform(sigma, 1.0 - sigma))]
        # generate a random pair of points
        for k in range(1, N):
            #generate a new second pair of points
            a = (random.uniform(sigma, 1.0 - sigma), random.uniform(sigma, 1.0 - sigma))

            #check to see that none of the points are within 2*sigma of any of the pairs in L
            min_dist = min(math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) for b in L)

            if min_dist < 2.0 * sigma:
                #if the points are too close, throw everything out and start over with new L
                condition = False
                break
            else:
                # add a to L, and go to next point.  If condition = True (stays true) for the whole set of points, then you can end the while loop.
                L.append(a)
                condition = True
    return L

My vectorized version, using linalg.norm is:

def disk_sampling_mine(N, sigma):
    condition = False
    while condition == False:
        L = np.random.uniform(low=sigma, high=1-sigma, size=(1,2))
        for _ in range(1,N):
            a = np.random.uniform(low=sigma, high=1-sigma, size=(1,2))
            L = np.append(L,a, axis=0)
            min_dist = min((np.linalg.norm(p[0] - p[1]) 
                           for p in itertools.combinations(L, r=2)))
            if min_dist < 4 * sigma**2:
                condition = False
                break
            else:
                condition = True
    return list(L)

Then I check the code using the script they've given

sigma = 0.15
del_xy = 0.05
n_runs = 100000
conf_a = ((0.30, 0.30), (0.30, 0.70), (0.70, 0.30), (0.70,0.70))
conf_b = ((0.20, 0.20), (0.20, 0.80), (0.75, 0.25), (0.75,0.75))
conf_c = ((0.30, 0.20), (0.30, 0.80), (0.70, 0.20), (0.70,0.70))
configurations = [conf_a, conf_b, conf_c]
hits = {conf_a: 0, conf_b: 0, conf_c: 0}
for run in range(n_runs):
    x_vec = disk_sampling_theirs(4, sigma)
    for conf in configurations:
        condition_hit = True
        for b in conf:
            condition_b = min(max(abs(a[0] - b[0]), abs(a[1] - b[1])) for a in x_vec) < del_xy
            condition_hit *= condition_b
        if condition_hit:
            hits[conf] += 1

for conf in configurations:
    print (conf, hits[conf])

And for my trio of states I get: (12, 20, 17), their method (roughly, they change a bit) (1, 0, 0) my method (I rarely get more than 1 or 2 hits in total)

What am I missing?




Aucun commentaire:

Enregistrer un commentaire