Sunday, January 3, 2016

When I shuffle a copy of a DataFrame, why is the original DataFrame also shuffled?

Here is the input:

    import numpy as np
    import pandas as pd

    df1 = pd.DataFrame(np.random.randn(10, 3), columns=list("ABC"))
              A         B         C
    0  0.468682 -0.136178  0.418900
    1 -0.362995 -0.111931  0.433537
    2 -1.194483 -0.844683 -1.022719
    3  0.531893 -1.032088 -1.683009
    4  2.113807 -0.450628  0.004971
    5  0.141548 -0.621090 -0.135580
    6  0.128670 -0.460494 -0.016550
    7 -0.099141 -0.010140 -0.066042
    8  1.317759 -1.522207 -0.234447
    9 -0.039051 -1.395751 -0.431717

Then I create a copy of it. I assumed this clones the object rather than just creating a new reference to it. I want to shuffle the copy while keeping the original DataFrame untouched.

    df2 = df1.copy(deep=True)

I then shuffled df2 by shuffling its index values in place:

    np.random.shuffle(df2.index.values)

After that, I found that both df2 and df1 had been shuffled:

    df1.index
    Out[177]: Int64Index([7, 8, 0, 1, 3, 4, 6, 2, 5, 9], dtype='int64')

    df2.index
    Out[178]: Int64Index([7, 8, 0, 1, 3, 4, 6, 2, 5, 9], dtype='int64')

Why did this approach fail, and how can I shuffle the copy without affecting the original?
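For reference, here is a minimal sketch of one way to get the intended result. It rests on an assumption: that the two frames end up sharing the same underlying index array, so that shuffling `df2.index.values` in place also changes `df1`. Whether that sharing happens may depend on the pandas version. The sketch avoids in-place mutation entirely by building a new label array with `np.random.permutation` and assigning it to the copy. The names `shuffled_labels` and `df3` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.randn(10, 3), columns=list("ABC"))

# np.random.permutation returns a new shuffled array and leaves its input
# alone, unlike np.random.shuffle, which reorders the array it is given.
shuffled_labels = np.random.permutation(df1.index.values)

df2 = df1.copy(deep=True)
# Assigning to .index gives df2 a new Index object; df1 keeps its own.
df2.index = shuffled_labels

# If the goal is to reorder the rows themselves (data moves with its label),
# sampling every row without replacement is an alternative.
df3 = df1.sample(frac=1)

print(df1.index.tolist())
print(df2.index.tolist())
print(df3.index.tolist())
```

Note that the first variant keeps the data in place and only reassigns the labels, which matches what shuffling `df2.index.values` does, while `sample(frac=1)` moves whole rows.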



