I have written a function that randomly selects 303 data points from a total of 506. From those 303 points, I then randomly pick 203 to replicate, so the final sample again has 506 rows. Using this technique, I want to create 30 samples from a dataset (the Boston dataset).
The following example should make the procedure clearer:
Assume we have 10 data points [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. First we take 6 data points at random, say [4, 5, 7, 8, 9, 3]. Then we replicate 4 points from [4, 5, 7, 8, 9, 3], say [5, 8, 3, 7], so the final sample is [4, 5, 7, 8, 9, 3, 5, 8, 3, 7].
I have written the following code for this purpose:
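To make the toy example concrete, here is a minimal sketch of it, assuming NumPy's `Generator` API. The seed and the variable names are only illustrative and are not part of the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
points = np.arange(1, 11)       # [1, 2, ..., 10]

# Step 1: pick 6 distinct points (replace=False rules out repeats).
selected = rng.choice(points, size=6, replace=False)

# Step 2: pick 4 of those 6 to replicate, again without repeats.
replicated = rng.choice(selected, size=4, replace=False)

# Step 3: the final sample has 10 entries, 6 of which are unique.
final_sample = np.concatenate([selected, replicated])
print(final_sample)
```

The number of entries minus the number of unique entries is then exactly 4, which is the small-scale version of the 203 that the grader expects.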
    import numpy as np

    def generating_samples(input_data, target_data):
        selecting_rows = np.random.choice(len(input_data), 303)
        replacing_rows = np.random.choice(selecting_rows, 203, replace=False)
        selecting_columns = np.random.choice(3, 13, 1)
        sample_data = input_data[selecting_rows[:, None], selecting_columns]
        target_of_sample_data = target_data[selecting_rows]
        # replicating data
        replicated_sample_data = input_data[replacing_rows]
        target_of_replicated_sample_data = target_data[replacing_rows]
        # concatenating data
        final_sample_data = np.vstack((sample_data, replicated_sample_data))
        final_target_data = np.vstack((target_of_sample_data.reshape(-1, 1), target_of_replicated_sample_data.reshape(-1, 1)))
        return final_sample_data, final_target_data, selecting_rows, selecting_columns
Below is the grader function used to evaluate this code:
    def grader_samples(a, b, c, d):
        length = (len(a) == 506 and len(b) == 506)
        sampled = (len(a) - len(set([str(i) for i in a])) == 203)
        rows_length = (len(c) == 303)
        column_length = (len(d) >= 3)
        assert(length and sampled and rows_length and column_length)
        return True
The length checks on a, b, c and d all evaluate to True, but I get an AssertionError because of this statement:
    sampled = (len(a) - len(set([str(i) for i in a])) == 203)
The value should equal 203, but it does not. Can someone help me with this issue?
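One likely cause, offered as a hedged reading of the code rather than a confirmed diagnosis: `np.random.choice(len(input_data), 303)` samples with replacement by default, so the 303 row indices can already contain repeats and the duplicate count ends up above 203. In addition, `np.random.choice(3, 13, 1)` is interpreted as `a=3, size=13, replace=1`, so it draws 13 values from 0, 1, 2, and the replicated rows are taken with all columns instead of the selected ones. The sketch below addresses these three points. The `rng` parameter and the synthetic `X`, `y` arrays are assumptions added for illustration; they stand in for the Boston data and are not part of the original function.

```python
import numpy as np

def generating_samples(input_data, target_data, rng=None):
    """Return 303 unique rows plus 203 duplicates of them, on random columns."""
    rng = np.random.default_rng() if rng is None else rng
    n_rows, n_cols = input_data.shape

    # replace=False keeps the 303 row indices distinct.
    selecting_rows = rng.choice(n_rows, size=303, replace=False)
    # Duplicate 203 of those rows, each at most once.
    replacing_rows = rng.choice(selecting_rows, size=203, replace=False)

    # Pick between 3 and n_cols distinct column indices.
    n_selected_cols = rng.integers(3, n_cols + 1)
    selecting_columns = rng.choice(n_cols, size=n_selected_cols, replace=False)

    # Use the same columns for both parts so the duplicates match exactly.
    sample_data = input_data[selecting_rows[:, None], selecting_columns]
    replicated_sample_data = input_data[replacing_rows[:, None], selecting_columns]

    final_sample_data = np.vstack((sample_data, replicated_sample_data))
    final_target_data = np.concatenate(
        (target_data[selecting_rows], target_data[replacing_rows])
    ).reshape(-1, 1)
    return final_sample_data, final_target_data, selecting_rows, selecting_columns


# Synthetic stand-in for the Boston data (506 rows, 13 features); values are made up.
rng = np.random.default_rng(42)
X = rng.random((506, 13))
y = rng.random(506)

a, b, c, d = generating_samples(X, y, rng)
duplicates = len(a) - len(set(str(row) for row in a))
print(a.shape, b.shape, duplicates)
```

On real data the `sampled` check may still fail now and then: if only a few low-variety columns are selected, two different rows can have identical values in those columns, and the grader counts them as duplicates too. Calling the function again in that case is one possible workaround.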