Sample of Speech.txt:
Sample of Script.txt:
Expected Output:
Code (work in progress):

The above code seems to only work for the first line in Speech.txt and then stops. I want it to run through the entire file, i.e. line 2, line 3, etc. I also haven’t figured out how to output the results into a text file. I can only print out the results at the moment. Any help would be appreciated!

EDIT: Links to Script.txt and Speech.txt.

I have two text files: Speech.txt and Script.txt. Speech.txt contains a list of filenames of audio files and Script.txt contains the relevant transcript. Script.txt contains transcripts for all …

You can load the lines into lists with the readlines() method and then iterate over them. This avoids the problem that Kuldeep Singh Sidhu correctly identified of the pointer reaching the end of the file. Using
is another approach as well since this seems like your typical join problem.
0x000f4a03.wav
0x000f4a07.wav
0x000f4a0f.wav
Then it is just a matter of selecting the columns you want and saving them out.
0x000f4a0f | | And unites the clans against Nilfgaard?
0x000f4a11 | | Of course. He's already decreed new longships be built.
0x000f4a03 | | Thinking long-term, then. Think she'll succeed?
0x000f4a05 | | She's got a powerful ally. In me.
0x000f4a07 | | Son's King of Skellige. Congratulations to you.
C:/Speech/0x000f4a03.wav|Thinking long-term, then. Think she'll succeed?
C:/Speech/0x000f4a07.wav|Son's King of Skellige. Congratulations to you.
C:/Speech/0x000f4a0f.wav|And unites the clans against Nilfgaard?
f1 = open(r'C:/Speech.txt', "r", encoding='utf8')
f2 = open(r'C:/script.txt', "r", encoding='utf8')
for line1 in f1:
    for line2 in f2:
        if line1[0:10] == line2[0:10]:
            print('C:/Speech/' + line2[0:10] + '.wav' + '|' + line2[26:-1])
f1.close()
f2.close()
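One likely reason the loop above stops after the first line of Speech.txt is that a file object is an iterator: once the inner loop has read Script.txt to the end, later passes over it yield nothing. Here is a minimal sketch of that behaviour, using in-memory io.StringIO objects as stand-ins for the two files. The sample contents and the names speech, script, passes and rewound are illustrative assumptions, not part of the original code.

```python
import io

# In-memory stand-ins for Speech.txt and Script.txt (illustrative data only).
speech = io.StringIO("0x000f4a03.wav\n0x000f4a07.wav\n")
script = io.StringIO("0x000f4a03 | | first line\n0x000f4a07 | | second line\n")

# Count how many script lines each pass of the inner loop actually sees.
passes = []
for line1 in speech:
    # Iterating `script` consumes it; after the first pass it is exhausted.
    passes.append(sum(1 for _ in script))

print(passes)

# Rewinding with seek(0) before each pass makes every pass see the whole
# file again. Reading the lines into a list once has the same effect.
speech.seek(0)
rewound = []
for line1 in speech:
    script.seek(0)
    rewound.append(sum(1 for _ in script))

print(rewound)
```

The first list shows the second pass seeing no lines at all, which matches the symptom described in the question.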
I would read the Script.txt contents into a dictionary, then use this dictionary as you iterate the lines from Speech.txt, and only print lines that exist. This avoids the need to iterate the file multiple times, which could be quite slow if you have large files.

It's also much easier to use With Statement Context Managers to open your files, since you don't need to call close() yourself.
to get the filename from your
f1 = open(r'C:/Speech.txt', "r", encoding='utf8')
f2 = open(r'C:/script.txt', "r", encoding='utf8')
lines1 = f1.readlines()
lines2 = f2.readlines()
f1.close()
f2.close()
with open("output.txt", "w") as outfile:
    for line1 in lines1:
        for line2 in lines2:
            if line1[0:10] == line2[0:10]:
                # write() takes a single string, so the newline is concatenated
                outfile.write('C:/Speech/' + line2[0:10] + '.wav' + '|' + line2[26:-1] + "\n")
files. I find this easier to use than the
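
The dictionary approach described above could look roughly like the sketch below. It is not the original answer's code. It assumes each Script.txt line has three pipe-separated fields (ID, an unused middle field, transcript), and it uses os.path.splitext to strip the .wav extension. The function names build_lookup, match_lines and write_matches and the prefix parameter are hypothetical.

```python
import os


def build_lookup(script_lines):
    """Map each ID to its transcript, assuming 'ID | middle | transcript' lines."""
    lookup = {}
    for line in script_lines:
        parts = line.split("|", 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        lookup[parts[0].strip()] = parts[2].strip()
    return lookup


def match_lines(speech_lines, script_lines, prefix="C:/Speech/"):
    """Return 'path|transcript' rows for every speech file found in the script."""
    lookup = build_lookup(script_lines)  # Script.txt is read only once
    rows = []
    for name in speech_lines:
        name = name.strip()
        key = os.path.splitext(name)[0]  # "0x000f4a03.wav" -> "0x000f4a03"
        if key in lookup:
            rows.append(f"{prefix}{name}|{lookup[key]}")
    return rows


def write_matches(speech_path, script_path, out_path):
    """File-based wrapper; the with statement closes all three files."""
    with open(speech_path, encoding="utf8") as speech, \
         open(script_path, encoding="utf8") as script, \
         open(out_path, "w", encoding="utf8") as out:
        for row in match_lines(speech, script):
            out.write(row + "\n")


# Demo using the sample lines from the question.
speech_sample = ["0x000f4a03.wav\n", "0x000f4a07.wav\n", "0x000f4a0f.wav\n"]
script_sample = [
    "0x000f4a0f | | And unites the clans against Nilfgaard?\n",
    "0x000f4a11 | | Of course. He's already decreed new longships be built.\n",
    "0x000f4a03 | | Thinking long-term, then. Think she'll succeed?\n",
    "0x000f4a05 | | She's got a powerful ally. In me.\n",
    "0x000f4a07 | | Son's King of Skellige. Congratulations to you.\n",
]
for row in match_lines(speech_sample, script_sample):
    print(row)
```

Because each lookup in the dictionary takes roughly constant time, the work grows with the combined length of the two files rather than with their product, which is what makes this faster than the nested loops on large inputs.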