Thursday, October 21, 2021

ValueError: Sample larger than population or is negative when trying to randomly swap elements in a list

I am benchmarking several sorting algorithms, timing them on arrays of different sizes and initial orderings: sorted, random, and reversed arrays ranging in size from 10^0 to 10^8 elements. I get an error whenever I call my swap_random function from within the main benchmarking function, and I cannot understand why.

My main code looks like this:

import copy
import timeit

import numpy as np
import pandas as pd


def benchmark(sorting_functions,
              base,
              power,
              seed,
              runs=5):
    results = pd.DataFrame(columns=['sorting algo', 'order', 'size', 'run', 'time'])

    for order in ['sorted', 'reverse', 'random', 'semi_sorted']:
        for x in range(power + 1):
            # generate random data
            rng = np.random.default_rng(seed)
            testing_data = rng.uniform(size=base**x)
            print(testing_data)

            # sorting of data
            if order == 'sorted':
                testing_data = sorted(testing_data)
                print(testing_data)

            if order == 'reverse':
                testing_data = sorted(testing_data, reverse=True)
                print(testing_data)

            # error here!!!!!
            if order == 'semi_sorted':
                testing_data = swap_random(testing_data, 2)

            # Timer function
            clock = timeit.Timer(stmt='sorting_function(copy(data))',
                                 globals={
                                     'sorting_function': sorting_functions,
                                     'data': testing_data,
                                     'copy': copy.copy
                                 })
            n_ar, t_ar = clock.autorange()
            t = clock.repeat(repeat=5, number=n_ar)

            # record one row per timed repetition
            for run in range(runs):
                results = results.append(
                    {'sorting algo': f'{sorting_functions.__name__}',
                     'order': order, 'size': base**x, 'run': run + 1, 'time': t[run] / n_ar},
                    ignore_index=True)

    print(results)

The function that raises the error is:

def swap_random(seq, num_swaps):
    import random
    # looping to do as many random swaps as wanted
    # getting the index of the list
    index = range(len(seq))
    for i in range(num_swaps):
        #choosing two random values
        i1, i2 = random.sample(index, num_swaps)
        print(i1, i2)
        #swapping them
        seq[i1], seq[i2] = seq[i2], seq[i1]
    
    return seq

The error I'm getting is:

ValueError                                Traceback (most recent call last)
<ipython-input-22-c87ac57f742a> in <module>
     12 
     13 for title, sort in sorting_functions.items():
---> 14     benchmark(sort, base=10, power=1, seed=5)

<ipython-input-19-b03c2fcec9dc> in benchmark(sorting_functions, base, power, seed, runs)
     50 
     51                 if order == 'semi_sorted':
---> 52                     testing_data = swap_random(testing_data, 2)
     53 
     54             # Timer function

<ipython-input-18-b813edad15a9> in swap_random(seq, num_swaps)
     14     for i in range(num_swaps):
     15         #choosing two random values
---> 16         i1, i2 = random.sample(index, 2)
     17         print(i1, i2)
     18         #swapping them

~\anaconda3\lib\random.py in sample(self, population, k)
    361         n = len(population)
    362         if not 0 <= k <= n:
--> 363             raise ValueError("Sample larger than population or is negative")
    364         result = [None] * k
    365         setsize = 21        # size of a small set minus size of an empty list

ValueError: Sample larger than population or is negative
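For reference, the traceback can probably be reproduced in isolation. The sketch below assumes the call from the traceback (base=10, power=1), for which `range(power + 1)` yields array sizes 10**0 = 1 and 10**1 = 10. The helper names `array_sizes` and `try_sample` are hypothetical and only serve to illustrate that `random.sample` cannot draw 2 distinct indices from a population of 1.

```python
import random


def array_sizes(base, power):
    """Sizes produced by `for x in range(power + 1)` with `size=base**x`."""
    return [base**x for x in range(power + 1)]


def try_sample(population_size, k=2):
    """Return k sampled indices, or the error text if sampling fails."""
    try:
        return random.sample(range(population_size), k)
    except ValueError as exc:
        # Raised when k is larger than the population (or negative).
        return f"ValueError: {exc}"


for size in array_sizes(base=10, power=1):
    # For size 1 there is only one index, so asking for 2 fails.
    print(size, try_sample(size))
```

Under these assumptions, the size-1 array is the only case that fails; the size-10 array has enough indices to sample from.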

I would greatly appreciate any help with this. Thank you so much.
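
A possible workaround, sketched under the assumption that arrays with fewer than two elements should simply be left unchanged: guard the swap so that `random.sample` is only called when at least two indices exist. The name `swap_random_guarded` and the optional `rng` parameter are hypothetical and not part of the code above.

```python
import random


def swap_random_guarded(seq, num_swaps, rng=None):
    """Return a copy of seq after num_swaps random pairwise swaps.

    Hypothetical variant of swap_random: sequences with fewer than
    two elements are returned unchanged, since there is no pair to swap.
    """
    if rng is None:
        rng = random.Random()
    result = list(seq)  # work on a copy so the caller's data is not mutated
    if len(result) < 2:
        return result
    for _ in range(num_swaps):
        # Always draw exactly two distinct indices, independent of num_swaps.
        i1, i2 = rng.sample(range(len(result)), 2)
        result[i1], result[i2] = result[i2], result[i1]
    return result


print(swap_random_guarded([0.5], 2))              # single element: unchanged
print(swap_random_guarded([1, 2, 3, 4, 5], 2))    # two random swaps
```

If "semi_sorted" is meant to be a sorted array with a few elements out of place (an assumption about the intent), the data would presumably need to be sorted first, for example `swap_random_guarded(sorted(testing_data), 2)`.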



