I have written a program that generates a large data set and randomizes it according to some conditions. Please go through my whole program and the conditions I list here; if anything is not clear, please ping me.
Input data:
Asset_Id Asset Family Asset Name Location Asset Component Keywords
1 Haptic Analy HAL1 Zenoa Tetris Measuring Unit Measurement Inaccuracy,
1 Haptic Analy HAL2 Zenoa Micro Pressure Platform Low air pressure,
1 Haptic Analy HAL3 Technopolis Rotation Chamber Rotation Chamber Intermittent Slowdown
1 Haptic Analy HAL4 Technopolis Mirror Lens Combinator Mirror Angle Skewed,
2 Hyperdome Insp HISP1 Technopolis Laser Column Column Alignment Issues,
2 Hyperdome Insp HISP2 Zenoa Turbo Quantifier Quantifier output Drops Intermittently
2 Hyperdome Insp HISP3 Technopolis Generator Generator
2 Hyperdome Insp HISP4 Zenoa High Frequency Emulator Emulator Frequency Drop
3 Nano Dial Assem NDA11 Zenoa Fusion Tank Fall in Diffusion Ratio
3 Nano Dial Assem NDA12 Zenoa Dial Loading Unit Faulty Scanner Unit
3 Nano Dial Assem NDA13 Zenoa Vaccum Line Control Above Normal
3 Nano Dial Assem NDA14 Zenoa Wave Generator Generator Power Failure
4 Geometric Synth GeoSyn22 La Puente Scanning Electronic Faulty Scanner Unit
4 Geometric Synth GeoSyn23 La Puente Draft Synthesis Chamber Beam offset beyond Tolerance
4 Geometric Synth GeoSyn24 La Puente Progeometric Plane Progeometric Plane Fault Detected
4 Geometric Synth GeoSyn25 La Puente Ion Gas Diffuser Column Alignment Issues
CONDITIONS:
1) The data should be read from a CSV file, and all rows should be randomized.
2) The "Location" column should also be randomized separately and printed along with the rest of the randomized data.
3) More than 30k rows should be generated from the given data.
4) Important: the "Asset Component" column should also be randomized separately, but only within each "Asset Family". The components that belong to "Haptic Analyser" must not get mixed with those of "Hyperdome Inspector", "Nano Dial Assembler", and so on. In other words, after shuffling, every "Asset Component" value must still sit in a row of the "Asset Family" it originally came from. If you have any doubt about the 4th condition, please let me know.
For this I have written a program which is meant to satisfy the first three conditions:
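To make condition 3 concrete, here is a minimal sketch of how the number of copies could be worked out from the size of the seed data instead of being hard-coded. The helper name `repeat_to_more_than` and the tiny four-row frame are only illustrations (they are not part of my program), and I am assuming that plain repetition of the seed rows is acceptable.

```python
import pandas as pd

def repeat_to_more_than(seed_df: pd.DataFrame, min_rows: int) -> pd.DataFrame:
    """Repeat the seed rows until there are strictly more than min_rows rows, then shuffle."""
    # Integer division plus one always gives strictly more than min_rows rows.
    copies = min_rows // len(seed_df) + 1
    big = pd.concat([seed_df] * copies, ignore_index=True)
    # Shuffle all rows and discard the old index.
    return big.sample(frac=1).reset_index(drop=True)

# Hypothetical four-row stand-in for the real seed file.
seed_df = pd.DataFrame({"Asset Name": ["HAL1", "HAL2", "HISP1", "NDA11"]})
big = repeat_to_more_than(seed_df, 30_000)
print(len(big))
```

With the 16 seed rows shown above, 501 copies would give only 8,016 rows, so the multiplier has to depend on how many rows the seed file really has.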
import pandas as pd

def main():
    df = pd.read_csv("C:\\Users\\rahul\\Desktop\\Data Manufacturing - Seed Data.csv")
    # Randomise all rows; reset the index so the positional assignment below is unambiguous
    ds = df.sample(frac=1).reset_index(drop=True)
    # Randomise the Location column separately.
    # Note: join() aligns on the index, which puts every shuffled value back on
    # its original row, so the raw values are assigned by position instead.
    ds["Location"] = df["Location"].sample(frac=1).to_numpy()
    result = ds[['Asset_Id ', 'Asset Family', 'Asset Name', 'Location', 'Asset Component','Keywords','Conditions','Parts','No. of Parts','SR_Id','SR_Date','SR_Month','SR_Year']]
    # Now randomise the whole data again
    ds1 = result.sample(frac=1)
    # Generate a large data set (501 copies of the seed rows) and randomise it.
    # DataFrame.append was removed in pandas 2.0, so pd.concat is used.
    dd = pd.concat([ds1] * 501, ignore_index=True)
    ds2 = dd.sample(frac=1)
    print(ds2)
    # Save the large data set (ds2), not the small one (ds1)
    ds2.to_csv('C:\\Users\\rahul\\Desktop\\people1.csv')

if __name__ == '__main__':
    main()
This program generates a large data set, randomizes it, and also randomizes the "Location" column. The only thing I am not able to do is the 4th condition: "Asset Component" should be randomized too, but according to the "Asset Family" column, so that the "Asset Component" values of "Haptic Analyser" and "Hyperdome Inspector" do not mix with each other and are printed separately.
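To show what I mean by the 4th condition, here is a minimal sketch of one possible approach: shuffle "Asset Component" inside each "Asset Family" group only. The function name `shuffle_within_family`, the `seed` parameter and the small demo frame are my own illustrations, and the way I split the sample rows into columns is an assumption based on the input data above.

```python
import numpy as np
import pandas as pd

def shuffle_within_family(df, group_col="Asset Family",
                          target_col="Asset Component", seed=None):
    """Shuffle target_col separately inside each group of group_col."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    # transform() keeps every row in place; only the target values are
    # permuted, and only among rows that share the same group value.
    out[target_col] = out.groupby(group_col)[target_col].transform(
        lambda s: rng.permutation(s.to_numpy())
    )
    return out

# Small demo frame built from a few of the sample rows (column split assumed).
demo = pd.DataFrame({
    "Asset Family": ["Haptic Analy"] * 4 + ["Hyperdome Insp"] * 4,
    "Asset Component": [
        "Tetris Measuring Unit", "Micro Pressure Platform",
        "Rotation Chamber", "Mirror Lens Combinator",
        "Laser Column", "Turbo Quantifier",
        "Generator", "High Frequency Emulator",
    ],
})
shuffled = shuffle_within_family(demo, seed=0)
print(shuffled)
```

In this sketch a "Haptic Analy" row can only receive a component that originally belonged to another "Haptic Analy" row, which is the behaviour I am after. Whether the other columns (for example "Keywords") should move together with the component is a separate decision that the sketch does not make.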
The output data:
Asset_Id Asset Family Asset Name Location Asset Component Keywords
3 Nano Dial Assem NDA11 Zenoa Fusion Tank Fall in Diffusion Ratio
1 Haptic Analy HAL3 Technopolis Rotation Chamber Rotation Chamber Intermittent Slowdown
2 Hyperdome Insp HISP2 Zenoa Turbo Quantifier Quantifier output Drops Intermittently
4 Geometric Synth GeoSyn25 La Puente Ion Gas Diffuser Column Alignment Issues
1 Haptic Analy HAL1 Zenoa Tetris Measuring Unit Measurement Inaccuracy,
2 Hyperdome Insp HISP1 Technopolis Laser Column Column Alignment Issues,
3 Nano Dial Assem NDA14 Zenoa Wave Generator Generator Power Failure
4 Geometric Synth GeoSyn24 La Puente Progeometric Plane Progeometric Plane Fault Detected
This output satisfies the first three conditions; only the 4th condition I am not able to do. Please help me get it. Thanks in advance.
Note: please go through all my conditions before coming to the code. If you are not able to understand anything, please leave a comment. Thanks.