Wednesday, 27 January 2021

randomizing numbers found in text doesn't find all numbers

My goal is to search a text, find all numbers, and replace each one with a random number. The numbers are written as words, not as digits. I've written a small test script that gives me the correct result, but on the actual text my code doesn't find every word that is a number; it only finds one. The text in my file is this:

    [{"text": "ZIP code for your address? Three, zero, zero five, eight. bye.", "s": 1090, "speaker": "Unknown", "e": 1787571}]

The code is like this:

    import json
    import os
    import random

    def digit_randomize_json(list_path, jsonDir, output):

        digits = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
                  "ten", "zero"]
        generated_digit = (random.choice(digits))

        with open (list_path, 'r') as list_file:
            for raw_filename in list_file:
                filename = raw_filename.rstrip()
                pathandfile = os.path.join(jsonDir, filename)
                f = open(pathandfile)
                data = json.load(f)
                f.close()
                words = data['transcript']['turns'][0]['text']
                words = words.split()

                for word in words:
                    word = word.lower()
                    if word in digits:
                        word = generated_digit

                        with open(os.path.join(output, filename), 'w' ) as outfile:
                            print("processing", filename)
                            json.dump(data, outfile)

What I would expect as a result is something like this:

    [{"text": "ZIP code for your address? four, four, five nine, zero. bye.", "s": 1090, "speaker": "Unknown", "e": 1787571}]

Instead, the output is unchanged. Even when I insert a print statement to print the "word" variable, I only get one digit back.

Here's my test code that works:

    import random

    x = ['one','two','three','four','five','six']
    y = 'once i was three and then i was four'
    y_split = y.split()
    for word in y_split:
        if word in x:
            word = random.choice(x)
            print(word)

I've looked at both versions over and over and can't see what is preventing the actual code from working. Can someone point out what I'm doing wrong?
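For comparison, here is a minimal sketch of one way the replacement could be done. It is not a confirmed fix. It rests on a few observations about the code above: `split()` leaves punctuation attached, so "Three," lowercases to "three," and is not in `digits`, while only the bare "zero" matches. Assigning to `word` inside the loop does not change `data`. `generated_digit` is chosen once, so every match would get the same value. The names `DIGIT_RE` and `randomize_digit_words` are hypothetical, and the sketch assumes the sample is a list of turns, each with a "text" key, as in the file excerpt shown earlier.

```python
import random
import re

DIGITS = ["zero", "one", "two", "three", "four", "five",
          "six", "seven", "eight", "nine", "ten"]

# Match any digit word as a whole word, ignoring case, so that
# "Three," and "eight." are found despite the attached punctuation.
DIGIT_RE = re.compile(r"\b(" + "|".join(DIGITS) + r")\b", re.IGNORECASE)


def randomize_digit_words(text):
    """Return text with each spelled-out digit replaced by a random one."""
    # The callback runs once per match, so each word gets its own choice.
    return DIGIT_RE.sub(lambda match: random.choice(DIGITS), text)


# Sample data copied from the file excerpt in the question.
turns = [{"text": "ZIP code for your address? Three, zero, zero five, eight. bye.",
          "s": 1090, "speaker": "Unknown", "e": 1787571}]

for turn in turns:
    # Write the new string back into the structure that will be dumped.
    turn["text"] = randomize_digit_words(turn["text"])

print(turns[0]["text"])
```

With this approach the modified string is stored back in the data before anything is written out, so `json.dump` could then be called once per file, after the loop, rather than inside it.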



